opensim-org / opensim-core

SimTK OpenSim C++ libraries and command-line applications, and Java/Python wrapping.
https://opensim.stanford.edu
Apache License 2.0
797 stars 321 forks source link

Locale settings are inconsistent between `TimeSeriesTable`/`Storage` and applications #3924

Open ksomml opened 1 month ago

ksomml commented 1 month ago

I am on a fresh Ubuntu 22.04 install and installed the latest OpenSim (main branch build).

When trying to load this test data test.mot through the OpenSim GUI onto a model:

inDegrees=yes
name=ik_example_imu_data_orientations
DataType=double
version=3
OpenSimVersion=4.5.1-2024-09-27-cbf9c74c0
endheader
time    pelvis_tilt
3   0.0
3.005   90.0

.. I am receiving this error:

[warning] Storage: FileAdpater failed to read data file.
Timestamp at row 0 with value 3,000000 is greater-than/equal to timestamp at row 1 with value 3,000000
    Thrown at TimeSeriesTable.h:533 in validateRow().

It seems to only accept the data if the time starts from 0. Also this does not work for me either:

...
time pelvis_tilt
0   0.0
0.1 90.0

Only if the timestamps are full seconds it seems to pass. Everything worked on my old system with OpenSim 4.5 install (code below should be identical to 4.5.1). Not sure how I can even debug this as I get the same results using the official examples from the OpenSim Modules --> Code --> Python --> OpenSenseExample

Whats the problem here?

Regarding code:

https://github.com/opensim-org/opensim-core/blob/c9336a887f6a1649b2cabbe3ae92d62d696e2955/OpenSim/Common/Storage.cpp#L205-L230

https://github.com/opensim-org/opensim-core/blob/cbf9c74c065194dd4ac2d350bbd580a09a575848/OpenSim/Common/TimeSeriesTable.h#L518-L537

ksomml commented 1 month ago

It turned out to be an issue with my locale settings. However, it's also a problem with OpenSim. The osim.IMUInverseKinematicsTool.run(visualizeTracking) function does not account for locale settings, exporting numerics with dots, while TimeSeriesTable.validateRow() seems to validate the data based on the system's locale (which in my case was set to de_DE, using commas instead of dots). This creates an unfavorable combination and caused this error.

There should definitely be a notice about configuring locale settings before using OpenSim, or the validation process should be adjusted to handle this scenario.

PS: I don't like commas for numerics either.

alexbeattie42 commented 1 month ago

I think it would be beneficial to define the behavior that should be supported (or expected). I can see two cases:

  1. OpenSim should read the data file based on the system locale (in this case for most European locales decimal points and commas are switched from the US version). I think this doesn't make sense if the file standard is defined to always contain a period to indicate the decimal point.
  2. OpenSim should read the data file always as if it was en_US.UTF-8 regardless of the local system locale. I think this is the more robust/probable answer

Which behavior would you be expecting?

ksomml commented 1 month ago

I am expecting behavior 2. OpenSim's OpenSense IK exports the .mot files with dots no matter what locale system settings, so reading the data again should also be expected with dots.

Either way, i believe noone who lives in a country which still uses commas for numerics (e.g. Germany) uses them in research. Standard are dots in my opinion.

alexbeattie42 commented 1 month ago

I agree that in research the commas are not used for numerics generally but the system locale is important for things like date and calendar display settings within the system and on the desktop.

I think I've figured out where the problem could be solved and found this article and this post very helpful.

Here is where the file stream is opened and on line 281 it is opened. So I believe you could do something like this to change the locale to en_US for that specific ifstream without impacting the system configuration.

alexbeattie42 commented 1 month ago

Also are you able to reopen this issue? It looks like the closing of the linked PR also closed the issue inadvertently.

ksomml commented 1 month ago

Also are you able to reopen this issue? It looks like the closing of the linked PR also closed the issue inadvertently.

I only closed the PR for the opensim-gui because I submitted two PR requests for the each linux build script (core+gui) and the current approach of fixing this issue seems to be to fix the root cause which lies in opensim-core. The PR for opensim-core is still open. Alternatively just changing LC_NUMERIC to en_us.UTF-8 might also fix the issue, though still not the root and it would probably be necessary that LC_ALL is not set.

ksomml commented 1 month ago

@alexbeattie42

Here is where the file stream is opened and on line 281 it is opened. So I believe you could do something like this to change the locale to en_US for that specific ifstream without impacting the system configuration.

Okay I see, simply instering this line into

fp->imbue(std::locale("en_US.UTF-8"));

..under this part of the Storage.cpp file https://github.com/opensim-org/opensim-core/blob/c9336a887f6a1649b2cabbe3ae92d62d696e2955/OpenSim/Common/Storage.cpp#L181-L186 ..could be a quick fix for this problem.

But before I would try that out by changing the source code, rebuilding OpenSim and seeing if it works with German locale settings, to me this issue seems to be a real root problem, so why not already imbue these settings in IO::OpenInputFile(fileName) of IO.cpp here:

https://github.com/opensim-org/opensim-core/blob/c9336a887f6a1649b2cabbe3ae92d62d696e2955/OpenSim/Common/IO.cpp#L430C1-L476C80

Any opinions on that?

alexbeattie42 commented 1 month ago

I think that is a great solution! I am still relatively new with this code base so I didn't get to that level of digging yet. Hopefully the maintainers could also chime in and I'm happy to test the change against the Finnish locale (where I'm currently located) if needed.

ksomml commented 1 month ago

Alright, i'll try it out the fix in IO.cpp first and check if these changes work. If so I will open a new PR for a clean commit history. Maybe that way some maintainers will have a deeper look into this as a change in IO.cpp could affect many things.

/edit: Actually I will try it out in Storage.cpp due to the reason mentioned.

alexbeattie42 commented 1 month ago

It looks like @adamkewley has used this solution in OpenSimCreator if you also want to check that for reference.

ksomml commented 1 month ago
   // it *writes* OSIM files using the locale, so you can end up with entries like:
   //
   //     <PathPoint_X>0,1323</PathPoint_X>
   //
   // but it *reads* OSIM files with the assumption that numbers will be in the format 'x.y'

This comment confuses me a bit because its the exact opposite of what i am experiencing.

I tried out the approach of inserting that one code line in that specific area of the Storage.cpp but it did not work as with this it seems to not load .osim files anymore. Though it could also be because i put everything to de_DE.UTF-8 now. So i believe the choice would be to try to just apply it to .mot files. Will keep looking for the right spot to insert it into.

adamkewley commented 1 month ago

iirc, I wrote the comment because we had a problem in prod where dutch students in TU Delft were ending up with incompatible data files when they were shared with the english speakers. I probably only tested it in a very specific situation, and it was probably borked in that specific way. It wouldn't surprise me if the same bug also happens for reading via different APIs. It stems from an inconsistent use of raw file writers vs. formatted file writers (e.g. FILE write() vs. std::ostream).

@alexbeattie42 , I'm actually feeling quite seen ^_^ - I think you're probably one of the first people to read OSC's source 😄

ksomml commented 1 month ago

iirc, I wrote the comment because we had a problem in prod where dutch students in TU Delft were ending up with incompatible data files when they were shared with the english speakers.

Yeah, the problem I am experiencing is essentially the same, though the other way around. The OpenSense IK exports with dots only and the import wants it with commas.

Also I noticed that the build scripts could use a rework, especially having two separate build scripts for Core and GUI would make sense, but not how it is currently implemented. The GUI build script also builds OpenSim core from source but does not have the Python tests in it. Why? The GUI gets installed properly into the /opt/ but the core does not, its just installed right at the home directory, why? Everything seems a bit unorganized. In my opinion it would be best to open a new repository under opensim called opensim-build-scripts and have a unified script where you can easily choose if you want core, gui or both installed.

Either way, when I tried out the fix, I executed the OpenSim core script without checking (for some reason I was hoping that it would build my local modified source code, but it always pulls the latest main branch before building) and now I can't load models in the OpenSim GUI anymore, which is weird to me because essentially I just installed OpenSim again and nothing should've changed.

Will post the exception I am getting inside the GUI when trying to import an .osim file as it might be connected to this problem (?) :

java.lang.RuntimeException: SimTK Exception thrown at MassProperties.h:542:
  Error detected by Simbody method Inertia::operator-=(): Diagonals of an Inertia matrix must satisfy the triangle inequality; got 0.0001,0.0002,0.001.
  (Required condition 'Ixx+Iyy+Slop>=Izz && Ixx+Izz+Slop>=Iyy && Iyy+Izz+Slop>=Ixx' was not met.)

    at org.opensim.modeling.opensimSimulationJNI.Model_initSystem(Native Method)
    at org.opensim.modeling.Model.initSystem(Model.java:1020)
    at org.opensim.view.pub.OpenSimDB.addModel(OpenSimDB.java:146)
    at org.opensim.view.pub.OpenSimDB.addModel(OpenSimDB.java:141)
    at org.opensim.view.FileOpenOsimModelAction$2.run(FileOpenOsimModelAction.java:100)
    at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:311)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
    at java.awt.EventQueue.access$500(EventQueue.java:97)
    at java.awt.EventQueue$3.run(EventQueue.java:709)
    at java.awt.EventQueue$3.run(EventQueue.java:703)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
    at org.netbeans.core.TimableEventQueue.dispatchEvent(TimableEventQueue.java:136)
[catch] at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
    at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)

Either way, I will need to solve this first before I can try the fix for the current problem again.

adamkewley commented 1 month ago

That problem usually pops up when building in Debug mode. As far as I'm aware, opensim-core really only works when built in non-Debug modes (RelWithDebInfo works, though).

ksomml commented 1 month ago

I did not build in debug mode. Just used the standard execution without any parameters.

aymanhab commented 1 month ago

In terms of expected behavior, I think we're all in agreement that files should be written/read in US format (with decimal point). Accordingly we should plug any loopholes/cracks that end up writing files in a different format. We never run into this locally because our machines typically have US locale. We just need to use a common solution to enforce this for consistency.

alexbeattie42 commented 1 month ago

iirc, I wrote the comment because we had a problem in prod where dutch students in TU Delft were ending up with incompatible data files when they were shared with the english speakers. I probably only tested it in a very specific situation, and it was probably borked in that specific way. It wouldn't surprise me if the same bug also happens for reading via different APIs. It stems from an inconsistent use of raw file writers vs. formatted file writers (e.g. FILE write() vs. std::ostream).

It seems like actually fixing the issue would first involve removing usages of FILE* (this is in a c++ class so it does not make sense to use C file IO) and subsequent usages of IO::OpenFile. It is mainly only used here and here. It seems like both of those can be replaced with IO::OpenInputFile and IO::OpenOutputFile. Further usages of FILE * , for example here would need to be written to use std::fstream.

Once that is complete and there is only one place where the file IO happens then the stream can be imbued with the correct locale in the OpenInputFile and OpenOutputFile methods and things would work as expected.

nickbianco commented 1 month ago

Just getting up to speed on this after being in limited capacity over the last two weeks.

@alexbeattie42 that seems like a reasonable solution. I'm not sure when @aymanhab or I will get around to a fix. If you or @ksomml want to give this a shot (which might make sense since you're both outside the US) I'd be happy to review a PR.

As for the build scripts, in their current form I'd say they're currently most useful as an entry point for new developers. I would recommend using them as a reference for creating your own build scripts or IDE project settings (e.g., setting CMake variables via VS Code workspace settings). But I agree we should have better options for contributors to quickly get up to speed (the discussion on https://github.com/opensim-org/opensim-core/issues/3928 is a good start).

alexbeattie42 commented 1 month ago

@nickbianco @ksomml I have created a minimum reproduction of the issue here. The example sets the locale locally in the code (to Finnish but any locale which uses commas for numerics would work) so you do not have to modify your system locale at all to replicate the issue.

Now that I can reliably reproduce the issue I'm going to starting working on the fixes described above.