platonai / PulsarRPA

Automate webpages at scale, scrape web data completely and accurately with high performance, distributed AI-RPA.
Apache License 2.0
natural language date parser #72

Open platonai opened 4 months ago

platonai commented 4 months ago

We need to parse dates from natural-language text:

String text = "Let's meet next Friday for coffee.";
Parser parser = new Parser();
List<DateGroup> groups = parser.parse(text);

Popular Libraries:

Natty: A widely-used library that handles a variety of date formats, including relative dates ("next Tuesday"), formal dates ("2023-07-20"), and relaxed expressions ("the first Monday of April 2000").

Hawking: Developed by Zoho, Hawking is another powerful option that excels at context understanding. It considers the tense of the sentence and can extract multiple dates from a single input.

import com.joestelmach.natty.DateGroup;
import com.joestelmach.natty.Parser;
import java.util.List;

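public class DateParserExample {
    public static void main(String[] args) {
        String text = "Let's meet next Friday for coffee.";

        // Relative expressions such as "next Friday" are resolved against the current time.
        Parser parser = new Parser();
        List<DateGroup> groups = parser.parse(text);

        if (!groups.isEmpty()) {
            // Each DateGroup corresponds to one date expression found in the text;
            // only the first date of the first group is printed here.
            DateGroup group = groups.get(0);
            System.out.println("Parsed date: " + group.getDates().get(0));
        } else {
            System.out.println("No dates found.");
        }
    }
}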

Key considerations:

Accuracy: While these libraries are generally accurate, they can struggle with ambiguous expressions or domain-specific language.

Context: Hawking is particularly good at taking the context of a date expression into account, which leads to more accurate results.

Dependencies: Remember to add the chosen library as a dependency in your project.
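
To make the ambiguity point concrete, here is a minimal sketch of two plausible readings of "next Friday". It uses only java.time rather than Natty or Hawking, so it says nothing about which reading either library actually picks. The class name NextWeekdayAmbiguity and both method names are hypothetical, chosen for illustration only.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper, not part of Natty or Hawking: it only illustrates
// why an expression like "next Friday" can map to two different dates.
final class NextWeekdayAmbiguity {

    // Reading 1: the nearest occurrence of the weekday strictly after the reference date.
    static LocalDate nearestUpcoming(LocalDate reference, DayOfWeek day) {
        return reference.with(TemporalAdjusters.next(day));
    }

    // Reading 2: the occurrence of the weekday in the following Monday-based week.
    static LocalDate inFollowingWeek(LocalDate reference, DayOfWeek day) {
        LocalDate nextMonday = reference.with(TemporalAdjusters.next(DayOfWeek.MONDAY));
        return nextMonday.with(TemporalAdjusters.nextOrSame(day));
    }

    public static void main(String[] args) {
        LocalDate reference = LocalDate.of(2023, 7, 20); // a Thursday

        System.out.println(nearestUpcoming(reference, DayOfWeek.FRIDAY)); // 2023-07-21
        System.out.println(inFollowingWeek(reference, DayOfWeek.FRIDAY)); // 2023-07-28
    }
}
```

Whichever library we pick, it would be worth checking its output for this kind of input against a fixed reference date, so we know which reading it implements.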