SBECK-github / Date-Manip

Some Polish language abbreviations and names don't match Linux equivalents #41

Closed davidh2075 closed 1 year ago

davidh2075 commented 1 year ago

I've had some difficulty parsing Polish dates on Linux. The cause seems to be that the weekday abbreviations and one month name in the Date::Manip::Lang file polish.pm differ from those that Linux uses in date(1) and strftime(3).

I'm wondering if there is a reason for the current values being as they are, or if they could be updated to fix the parsing on Linux.

Weekdays

The day abbreviations are currently:

  day_abb => [
    ['po', 'po.', 'pon.', 'pon'],
    ['wt', 'wt.'],
    ['śr', 'śr.', 'sr.', 'sr'],
    ['cz', 'cz.', 'czw.', 'czw'],
    ['pi', 'pi.'],
    ['so', 'so.'],
    ['ni', 'ni.'],
  ],

When I test with those values, I get the following result. The blank output lines are failed parses:

% perl -MDate::Manip -lpe 'Date_Init("Language=Polish", "DateFormat=non-US"); $_=UnixDate(ParseDate($_), "%Y%m%d %T")'
pon 26 wrz 13:11:28 2022 AEST
wto 27 wrz 14:11:28 2022 AEST
śro 28 wrz 15:11:28 2022 AEST
czw 29 wrz 16:11:28 2022 AEST
ptk 30 wrz 17:11:28 2022 AEST
sob 24 wrz 11:11:28 2022 AEST
ndz 25 wrz 12:11:28 2022 AEST
20220926 13:11:28


20220929 16:11:28



%
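To make the mismatch explicit, here is a minimal sketch that compares the abbreviations from the test input against the day_abb table quoted above. It uses only core Perl and a plain case-insensitive string comparison, which is an assumption for illustration and not how Date::Manip matches internally. The helper name missing_abbreviations is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                              # this source contains literal UTF-8 (ś)
binmode STDOUT, ':encoding(UTF-8)';

# day_abb table as quoted in this issue, Monday first.
my @day_abb = (
    ['po', 'po.', 'pon.', 'pon'],
    ['wt', 'wt.'],
    ['śr', 'śr.', 'sr.', 'sr'],
    ['cz', 'cz.', 'czw.', 'czw'],
    ['pi', 'pi.'],
    ['so', 'so.'],
    ['ni', 'ni.'],
);

# Abbreviations that appear in the test input, Monday first.
my @observed = qw(pon wto śro czw ptk sob ndz);

# Hypothetical helper: return the observed abbreviations that have no
# exact (case-insensitive) match among that day's configured alternatives.
sub missing_abbreviations {
    my ($table, $seen) = @_;
    my @missing;
    for my $i (0 .. $#$seen) {
        my %known = map { lc($_) => 1 } @{ $table->[$i] };
        push @missing, $seen->[$i] unless $known{ lc $seen->[$i] };
    }
    return @missing;
}

my @missing = missing_abbreviations(\@day_abb, \@observed);
print "not in day_abb: @missing\n";
```

With the table above this prints `not in day_abb: wto śro ptk sob ndz`, which lines up with the five inputs that fail to parse.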

If I update the day abbreviations to the following values, adding the 3-letter abbreviations that Linux uses:

  day_abb => [
    ['po', 'po.', 'pon.', 'pon'],
    ['wt', 'wt.', 'wto', 'wto.'],
    ['śr', 'śr.', 'sr.', 'sr', 'śro', 'śro.', 'sro.', 'sro'],
    ['cz', 'cz.', 'czw.', 'czw'],
    ['pi', 'pi.', 'ptk', 'ptk.'],
    ['so', 'so.', 'sob', 'sob.'],
    ['ni', 'ni.', 'ndz', 'ndz.'],
  ],

and rerun the test, all dates parse:

% perl -MDate::Manip -lpe 'Date_Init("Language=Polish", "DateFormat=non-US"); $_=UnixDate(ParseDate($_), "%Y%m%d %T")'
pon 26 wrz 13:11:28 2022 AEST
wto 27 wrz 14:11:28 2022 AEST
śro 28 wrz 15:11:28 2022 AEST
czw 29 wrz 16:11:28 2022 AEST
ptk 30 wrz 17:11:28 2022 AEST
sob 24 wrz 11:11:28 2022 AEST
ndz 25 wrz 12:11:28 2022 AEST
20220926 13:11:28
20220927 14:11:28
20220928 15:11:28
20220929 16:11:28
20220930 17:11:28
20220924 11:11:28
20220925 12:11:28
%

Would those values be acceptable to add to the configuration?

Months

The Polish month name for February is configured as 'luty'. Linux uses 'lutego', which is the genitive form of luty.

Could 'lutego' be added as an alternative?

regards - David

SBECK-github commented 1 year ago

I have made those changes and they will be in the next release. If you want to check out the SBECK-github branch and verify that the changes work, that would be great.
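For anyone verifying the branch, the one-liner from the report can be turned into a small script. This is only a sketch: it assumes the branch's Date::Manip is the one found on @INC, and the helper name unparsed_samples is hypothetical. The sample lines are the ones from the report, and the result depends on the installed version, so no output is shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Load Date::Manip at run time so that a missing module is reported
# instead of stopping the script at compile time.
my $have_dm = eval { require Date::Manip; Date::Manip->import; 1 };

# Sample lines from the report. No "use utf8" here, so the literals stay
# as raw bytes, like the input read from STDIN by the one-liner.
my @samples = (
    'pon 26 wrz 13:11:28 2022 AEST',
    'wto 27 wrz 14:11:28 2022 AEST',
    'śro 28 wrz 15:11:28 2022 AEST',
    'czw 29 wrz 16:11:28 2022 AEST',
    'ptk 30 wrz 17:11:28 2022 AEST',
    'sob 24 wrz 11:11:28 2022 AEST',
    'ndz 25 wrz 12:11:28 2022 AEST',
);

# Hypothetical helper: return the lines for which ParseDate yields a
# false value (no date could be parsed).
sub unparsed_samples {
    my @failed;
    for my $line (@_) {
        my $date = ParseDate($line);
        push @failed, $line unless $date;
    }
    return @failed;
}

if ($have_dm) {
    Date_Init('Language=Polish', 'DateFormat=non-US');
    my @failed = unparsed_samples(@samples);
    print "unparsed: $_\n" for @failed;
    print "all samples parsed\n" unless @failed;
}
else {
    print "Date::Manip is not installed; nothing checked\n";
}
```

Running it once against the released module and once against the branch should show whether the added abbreviations are picked up.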

SBECK-github commented 1 year ago

I'm going to close this. Please reopen if you see continued problems.

davidh2075 commented 1 year ago

Sorry for the tardy response. I did test the changes you made to the polish.pm file and they work. In testing it, I realised that I gave you the version from my testing on macOS (Monterey, 12.5.1). So I've retested individually on macOS and Linux and included all the language files in the test. I have changes to 8 files. It's getting late here, so I'll merge the changes tomorrow and retest the merged changes on both operating systems. The changes use the SBECK-github branch as the base.

regards - David