tsegall / fta

Metadata/data identification Java library. Identifies Semantic Type information (e.g. Gender, Age, Color, Country,...). Extensive country/language support. Extensible via user-defined plugins. Comprehensive Profiling support.
Apache License 2.0

Getting DateTimeParseException while running FTA analysis on multiple column dataset #97

Closed by PrateekDubey55 2 months ago

PrateekDubey55 commented 2 months ago

We are using FTA version 15.7.3 for running analysis on the entire dataset, attached below.

Running the analysis on a single Date column works without any issues, but running it across all columns, which contain multiple date formats, throws a DateTimeParseException.

Please find the stack trace attached for reference.

I'm also attaching the code we use on our end to run the FTA analysis.

Please run the FTA analysis on the entire dataset, since that is where we see the issue.

Attachments: FTA_Date_stackTrace.txt, profile_e2e_customer_detail.csv, ProfileAggregator.txt

PrateekDubey55 commented 2 months ago

@tsegall I have some additional findings:

  1. I upgraded FTA to version 15.7.5, but I still get the error above.
  2. In the calculateFacts() method of Facts.java, the all-columns dataset (profile_e2e_customer_detail.csv, attached above) gives matchTypeInfo -> type = LocalDate, whereas the single-date-column file (single_date_data.csv) gives matchTypeInfo -> type = String.
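For anyone following along, here is a minimal sketch of how this kind of exception can arise in general. It uses only the standard java.time API and is not FTA's internal logic. The class name, method name, and sample values are hypothetical and are not taken from the attached files. The assumption it illustrates: once a column is treated as LocalDate with one fixed pattern, any value written in a different date format fails to parse.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;

// Illustration only: plain java.time, not FTA's implementation.
class MixedDateFormatDemo {

    // Counts how many values cannot be parsed with one fixed date pattern.
    static int countFailures(List<String> values, String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        int failures = 0;
        for (String value : values) {
            try {
                LocalDate.parse(value, formatter);
            } catch (DateTimeParseException e) {
                // The value is in a format the chosen pattern does not cover.
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Hypothetical sample values mixing two date formats in one column.
        List<String> mixed = Arrays.asList("2023-01-15", "15/01/2023", "2023-02-28");
        System.out.println("Failures with uuuu-MM-dd: " + countFailures(mixed, "uuuu-MM-dd"));
        System.out.println("Failures with dd/MM/uuuu: " + countFailures(mixed, "dd/MM/uuuu"));
    }
}
```

If the same values are handled as String instead, no parsing is attempted, which would be consistent with the single-column file not raising the exception.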

I'm attaching screenshots of these findings, along with the single-date-column file.

Screenshots: All_Column_MatchTypeInfo_LocalDate, Single_Date_Column_MatchTypeInfo

Attachment: single_date_data.csv

tsegall commented 2 months ago

I believe the issue is addressed in 15.7.6 - please download and validate.

PrateekDubey55 commented 2 months ago

Version 15.7.6 resolved the issue, thanks!