drewnoakes / metadata-extractor-images

Database of images from various digital cameras
https://drewnoakes.com/code/exif/

Regression tests should not use machine's locale #35

Open drewnoakes opened 7 years ago

drewnoakes commented 7 years ago

Developers currently cannot run the regression tests in locales that format things differently to en-GB (and some other English-language locales). This was identified by @Nadahar in drewnoakes/metadata-extractor#233.

The regression test program must set the JVM-wide default locale by calling Locale.setDefault(Locale.ROOT) before producing any output.

Locale.ROOT is a neutral, portable locale across JVMs.
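As a rough illustration of the effect (a minimal sketch, not the actual regression test program), the class below formats a number with whatever the JVM default locale is, first under a German default and then under Locale.ROOT. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class; not part of the regression test program.
final class RootLocaleDemo {
    // Formats with the JVM's current default locale, as any call
    // that omits an explicit Locale argument would.
    static String formatWithDefault(double value) {
        return String.format("%.2f", value);
    }

    public static void main(String[] args) {
        Locale previous = Locale.getDefault();
        try {
            // Simulate a developer machine with a non-English locale.
            Locale.setDefault(Locale.GERMANY);
            System.out.println(formatWithDefault(1.5)); // decimal comma

            // What the test program would do once, before any output.
            Locale.setDefault(Locale.ROOT);
            System.out.println(formatWithDefault(1.5)); // decimal point
        } finally {
            // Restore the original default so the demo has no side effects.
            Locale.setDefault(previous);
        }
    }
}
```

Because Locale.setDefault changes state for the whole JVM, this only suits a standalone program such as a test runner, not library code.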

This should be compared with the equivalent issue in the .NET project.

rdvdijk commented 3 years ago

Another approach could be to make the Locale configurable. Has that ever been considered? Instead of depending on a globally set locale, all number and date formatting would use an explicitly configured one.

I do understand that this would mean that the API of ImageMetadataReader (and related JpegMetadataReader and JpegReader, and many, many other classes) would need to change to propagate such configuration to the right places.
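To make the idea concrete, here is a minimal sketch of what an explicitly configured locale could look like. DescriptionFormatter is a hypothetical class invented for illustration; it is not part of the metadata-extractor API, and the format patterns are arbitrary.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: carries the configured Locale to every formatting call
// instead of relying on Locale.getDefault().
final class DescriptionFormatter {
    private final Locale locale;

    DescriptionFormatter(Locale locale) {
        this.locale = locale;
    }

    // Number formatting uses the configured locale, never the JVM default.
    String formatNumber(double value) {
        return String.format(locale, "%.1f", value);
    }

    // Date formatting also takes the configured locale (affects month names).
    // The time zone is pinned to UTC so the output does not depend on the machine.
    String formatDate(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("dd MMM yyyy", locale);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }
}
```

A reader class would then accept such an object (or just a Locale) and hand it down to whatever produces descriptions, which is where the API changes mentioned above come from.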

drewnoakes commented 3 years ago

That would be preferable, for sure.

This issue is mostly tracking the fact that the regression tests data set breaks if you run it in a different locale to me (i.e. not en-GB).

Do you have a use case for configuring the locale?

rdvdijk commented 3 years ago

We use metadata-extractor in a distributed environment, where setting the Locale globally has undesired side effects. One of our clients apparently configured a different global Locale, which resulted in unexpected behavior (in this particular case, longitude/latitude values were formatted in a way that could not be parsed downstream).

We see two solutions: Tell our client to use the en-GB or en-US locale, or somehow pass a fixed Locale to metadata-extractor. The latter is the reason why I'm here :wink:

drewnoakes commented 3 years ago

Specifying a locale across the library will be a big task and is unlikely to happen soon, though I agree it's the right thing to do.

For your specific scenario, we could override some methods for the specific tag you're looking at. Alternatively, you can format the strings yourself directly from the underlying data. Lat/lng is stored as numbers internally, so you could add your own locale-aware formatting without needing library changes.
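A minimal sketch of that second option, assuming the latitude and longitude have already been read from the library as plain doubles (how they are obtained is left out here). GeoFormat and toDecimalDegrees are hypothetical names, and the six-decimal output format is an arbitrary choice.

```java
import java.util.Locale;

// Hypothetical helper: formats coordinates independently of the JVM default locale.
final class GeoFormat {
    // Returns "lat, lng" in decimal degrees. Passing Locale.ROOT explicitly
    // guarantees a '.' decimal separator regardless of Locale.getDefault().
    static String toDecimalDegrees(double latitude, double longitude) {
        return String.format(Locale.ROOT, "%.6f, %.6f", latitude, longitude);
    }
}
```

Since the locale is passed on each call, the output stays parseable downstream even when a client has changed the global default.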

Are you using the Java or .NET version? I know .NET lets you set the locale on a given thread, which would also address your issue.

rdvdijk commented 3 years ago

We're using the Java version. Thanks for the tip, we'll look into accessing the underlying data.

I'll also explore the idea of passing the locale across the library. If it looks like a feasible approach, I'll send a pull request to discuss the details.