simonenkoi closed this issue 2 years ago
I think you can validate correctness with the following code:
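import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Locale;
import com.cobber.fta.dates.DateTimeParser;

// For every available locale, compare the format FTA detects for an
// ambiguous date with the field order of the locale's own LONG date pattern.
Arrays
    .stream(Locale.getAvailableLocales())
    .forEach(locale -> {
        var value = "02.02.2020";
        var dateTimeParser = new DateTimeParser()
            .withDateResolutionMode(DateTimeParser.DateResolutionMode.Auto)
            .withLocale(locale);
        dateTimeParser.train(value);
        var result = dateTimeParser.getResult();
        var fmt = ((SimpleDateFormat) DateFormat.getDateInstance(DateFormat.LONG, locale));
        System.out.printf(
            "Locale: %s, format: %s, should be %s%n",
            locale,
            result.getFormatString(),
            getFormatBasedOnFirstCharacter(fmt.toPattern())
        );
    });

// Returns the expected format depending on whether 'd' or 'M' appears first
// in the locale's pattern, or null if the pattern contains neither.
public static String getFormatBasedOnFirstCharacter(String str) {
    for (char ch : str.toCharArray()) {
        // day first
        if (ch == 'd') {
            return "dd.MM.yyyy";
        }
        // month first
        if (ch == 'M') {
            return "MM.dd.yyyy";
        }
    }
    return null;
}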
Also, some locales use the YMD pattern, so the "02.02.02" date should be identified as "yy.MM.dd" for the locales in the following example:
List
    .of(
        new Locale("nds"),
        new Locale("bo"),
        new Locale("lv"),
        new Locale("zh"),
        new Locale("vo"),
        new Locale("dz"),
        new Locale("sah"),
        new Locale("ml"),
        new Locale("mn"),
        new Locale("ja"),
        new Locale("my")
    )
    .forEach(locale -> {
        var value = "02.02.02";
        var dateTimeParser = new DateTimeParser()
            .withDateResolutionMode(DateTimeParser.DateResolutionMode.Auto)
            .withLocale(locale);
        dateTimeParser.train(value);
        var result = dateTimeParser.getResult();
        // %n added so that each locale is printed on its own line
        System.out.printf(
            "Locale: %s, format: %s%n",
            locale,
            result.getFormatString());
    });
I can create a separate issue for it; it doesn't affect me as an end-user because I currently don't support these locales.
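For reference, here is a minimal sketch of how the field order (DMY, MDY, or YMD) could be read off a locale's pattern using only the JDK. The class name `FieldOrderSketch` and the method `fieldOrder` are hypothetical and are not part of FTA. Unlike a plain first-character check, it skips quoted literals, so a literal such as 'de' is not mistaken for a day field.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper, not part of FTA: classifies the order of the
// day, month and year fields in a SimpleDateFormat pattern.
final class FieldOrderSketch {

    // Returns the order in which day (D), month (M) and year (Y) first
    // appear in the pattern, e.g. "DMY", "MDY" or "YMD".
    static String fieldOrder(String pattern) {
        StringBuilder order = new StringBuilder();
        boolean quoted = false;
        for (char ch : pattern.toCharArray()) {
            if (ch == '\'') {
                quoted = !quoted; // entering or leaving a quoted literal
                continue;
            }
            if (quoted) {
                continue; // letters inside quotes are literal text
            }
            char field;
            switch (ch) {
                case 'd': field = 'D'; break;
                case 'M':
                case 'L': field = 'M'; break; // 'L' is the standalone month
                case 'y': field = 'Y'; break;
                default: continue; // not a day, month or year letter
            }
            if (order.indexOf(String.valueOf(field)) < 0) {
                order.append(field); // record only the first occurrence
            }
        }
        return order.toString();
    }

    // Convenience overload: uses the locale's LONG date pattern, or returns
    // an empty string if the JDK does not hand back a SimpleDateFormat.
    static String fieldOrder(Locale locale) {
        DateFormat df = DateFormat.getDateInstance(DateFormat.LONG, locale);
        if (!(df instanceof SimpleDateFormat)) {
            return "";
        }
        return fieldOrder(((SimpleDateFormat) df).toPattern());
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"ja", "lv", "en-US", "de"}) {
            Locale locale = Locale.forLanguageTag(tag);
            System.out.printf("Locale: %s, order: %s%n", locale, fieldOrder(locale));
        }
    }
}
```

The result for a given locale depends on the locale data shipped with the JDK in use, so the output of `main` may differ between Java versions.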
The core issue is addressed in 9.0.18. The issue arose because fta-core (the date processing) is mostly invoked from fta (the semantic tagging and profiling module), which set the mode to day first or month first based on the locale. What version of Java were you using to see the YMD pattern?
List
    .of(
        new Locale("nds"),
        new Locale("bo"),
        new Locale("lv"),
        new Locale("zh"),
        new Locale("vo"),
        new Locale("dz"),
        new Locale("sah"),
        new Locale("ml"),
        new Locale("mn"),
        new Locale("ja"),
        new Locale("my")
    )
    .forEach(locale -> {
        var fmt = ((SimpleDateFormat) DateFormat.getDateInstance(DateFormat.LONG, locale));
        System.out.printf("Locale: %s, format: %s%n", locale, fmt.toPattern());
    });
Java 11.0.14.1 OpenJDK
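The pattern a locale reports comes from the JDK's locale data, and JDK 9 and later use CLDR data by default, so it can help to report the runtime details together with the pattern. A minimal sketch, assuming only the JDK (the class `LocaleDataReport` and the method `describe` are hypothetical names, not part of FTA):

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper, not part of FTA: summarises the runtime settings
// that influence which date pattern a locale reports.
final class LocaleDataReport {

    static String describe(Locale locale) {
        DateFormat df = DateFormat.getDateInstance(DateFormat.LONG, locale);
        String pattern = (df instanceof SimpleDateFormat)
            ? ((SimpleDateFormat) df).toPattern()
            : "(not a SimpleDateFormat)";
        return String.format(
            "java.version=%s, java.locale.providers=%s, locale=%s, pattern=%s",
            System.getProperty("java.version"),
            // unset unless the provider order was overridden on the command line
            System.getProperty("java.locale.providers", "(default)"),
            locale.toLanguageTag(),
            pattern);
    }

    public static void main(String[] args) {
        System.out.println(describe(Locale.JAPAN));
        System.out.println(describe(Locale.forLanguageTag("lv")));
    }
}
```

Running this on two different JDKs shows whether a difference in the detected field order comes from the locale data rather than from the parser.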
Validated the core issue; works as expected in 9.0.18. Thank you!
Code example to reproduce:
Expected behavior: The format is either "MM.dd.yyyy" or "dd.MM.yyyy", depending on the locale.
Actual behavior: All locales return the "MM.dd.yyyy" format.
FTA version: 9.0.17