drewnoakes / metadata-extractor

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files
Apache License 2.0

Regression Tests producing different data #626

Open TSGames opened 10 months ago

TSGames commented 10 months ago

When running the regression tests (Windows 11 JVM, German OS and German time zone; please also see PR #625), I'm getting diffs in multiple files, especially in comments or special data. It looks like an issue with file encoding, but I can't figure out what is causing it.

Some examples: [two screenshots attached to the original issue]

Any help/assistance would be appreciated!

drewnoakes commented 10 months ago

Are those diffs in the Java files or the .NET ones?

For Java, perhaps we need something like Locale.setDefault(...) to force a particular culture for consistency across those files.

For .NET it'd be Thread.CurrentThread.CurrentCulture or something like that.
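
For the Java side, a minimal sketch of why the default locale matters for generated output. The class and method names here are hypothetical and not part of metadata-extractor; the only library calls used are `Locale.setDefault` and `String.format`. It shows the same number rendering differently under a German default locale, and how either pinning the default or passing an explicit locale removes the difference.

```java
import java.util.Locale;

class LocaleFormattingDemo {
    // Formats using whatever the JVM's current default locale is.
    public static String formatWithDefault(double value) {
        return String.format("%.2f", value);
    }

    // Formats with an explicit locale, independent of the JVM default.
    public static String formatWithRoot(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    public static void main(String[] args) {
        Locale original = Locale.getDefault();
        try {
            // Simulate a German machine: the decimal separator becomes a comma.
            Locale.setDefault(Locale.GERMANY);
            System.out.println(formatWithDefault(1.5)); // 1,50
            System.out.println(formatWithRoot(1.5));    // 1.50

            // Pin the default so output no longer depends on the host OS.
            Locale.setDefault(Locale.ROOT);
            System.out.println(formatWithDefault(1.5)); // 1.50
        } finally {
            // Restore the original default so other code is unaffected.
            Locale.setDefault(original);
        }
    }
}
```

Pinning the default once at the start of a test run is the less invasive option; passing a locale at every call site is more robust but touches more code.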

drewnoakes commented 10 months ago

> For Java, perhaps we need something like Locale.setDefault(...) to force a particular culture for consistency across those files.

Ugh, I hadn't had my coffee. That's exactly what your PR that I approved yesterday does :)

TSGames commented 10 months ago

Yes, that solved the issues regarding number formatting. I tried setting the default encoding to UTF-8, but that didn't make a difference (I guess that is the default on Java anyway).
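
One way to check the encoding guess is to print the JVM's default charset and compare decoding results directly. UTF-8 has been the default charset on every platform only since JDK 18 (JEP 400); on older JDKs the default follows the OS. The sketch below uses hypothetical names and does not claim to reproduce the actual cause of the diffs in this issue. It only shows how the same bytes decode to different text under two charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Decodes bytes with an explicitly named charset instead of the platform default.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Shows which charset the JVM falls back to when none is specified.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // U+00E4 (a with diaeresis) is two bytes in UTF-8: 0xC3 0xA4.
        byte[] utf8Bytes = "\u00E4".getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset gives back one character.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8).length());      // 1

        // Decoding with a single-byte charset yields two unrelated characters.
        System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1).length()); // 2
    }
}
```

If the printed default charset differs between two machines, any code path that converts bytes to strings without naming a charset is a candidate for the kind of diff described above.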

Can you tell me which OS and locale you're running these tests on? If nothing helps I would configure an environment m

drewnoakes commented 10 months ago

I'm on Windows 11, running on en-AU.

I've made some progress on date formatting, by converting everything to UTC in the output files and will push that up later today hopefully.
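
A minimal sketch of the UTC idea in Java, assuming dates are available as `java.util.Date` values. The class name, method name, and output pattern are hypothetical and are not taken from the actual change. Formatting through a formatter with a fixed zone and locale makes the output independent of the machine's time zone.

```java
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class UtcDateDemo {
    // Fixed zone and locale, so neither the host time zone nor the host language affects output.
    private static final DateTimeFormatter UTC_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss 'UTC'", Locale.ROOT)
                    .withZone(ZoneOffset.UTC);

    // Renders a Date as a UTC timestamp string.
    public static String formatUtc(Date date) {
        return UTC_FORMAT.format(date.toInstant());
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);

        // The default toString() depends on the JVM's default time zone.
        System.out.println(epoch);

        // The UTC rendering is the same on every machine.
        System.out.println(formatUtc(epoch)); // 1970-01-01 00:00:00 UTC

        // Changing the default time zone does not change the UTC rendering.
        TimeZone original = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone("Europe/Berlin"));
            System.out.println(formatUtc(epoch)); // 1970-01-01 00:00:00 UTC
        } finally {
            TimeZone.setDefault(original);
        }
    }
}
```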

Were you seeing diffs in the .NET files or the Java ones?

drewnoakes commented 10 months ago

BTW you can work around the issue here by running the suite without your changes and staging the diff, then running it with your changes and comparing the workspace with the index.

TSGames commented 10 months ago

> I'm on Windows 11, running on en-AU.

Alright, I just wanted to make sure it has nothing to do with Linux/Windows encoding.

I will then probably go the git staging route to compare the CR3 PR changes.

drewnoakes commented 10 months ago

That probably makes the most sense for now. I can re-run your branch against the regression set too, as an extra check.