JSBSim-Team / jsbsim

An open source flight dynamics & control software library
GNU Lesser General Public License v2.1
1.34k stars 449 forks source link

Support for run JSBSim instance in multi-thread environment #666

Open kemen209 opened 2 years ago

kemen209 commented 2 years ago

Describe the issue I'm a developer, and used JSBSim as a C++ static library for a quiet long time. Everything works like a charms, until i try to run mutilple JSBSim instance conurrently. It will crash sometimes. After some debugging, i found that the message queue in FGJSBBase in a static variable. The thread try to pop a empty queue which clear by other thread lead to this issue.

 static std::queue <Message> Messages;

So i'd like to request a feature to support run JSBSim in multi-thread env.

Why not use multi-process for isolation? The communication is hard and heavy for my situation.

Environment:

Other information same type issue as #201

seanmcleod commented 2 years ago

I'll take a look to see why the Messages queue is static and whether it can't be made non-static.

Although note even if we make it non-static that doesn't mean the JSBSim code is thread safe.

kemen209 commented 2 years ago

I create a temperary fix for this issues:

Methods

  1. Move the static variable(message, debug_level and other stuff) in FGJSBBase into a standalone struct CommonData.
  2. Add a reference of CommonData in FGJSBBase, and a API named ‘gdata()’ for accessing.
  3. And create a standalone CommonData instance, and put it in FGFDMExec, all others reference to this CommonData instance.
  4. Update all subclass of FGJSBBase to take the reference of CommonData in its constructor.
  5. NOTE: use reference instead pointer to avoid nullptr issue, and make it explicitly required.

Bypass

Issue Remain

kemen209 commented 2 years ago

Patch with unhandled static var, with comment 'FMPS'

diff --git a/src/initialization/FGTrimAnalysis.cpp b/src/initialization/FGTrimAnalysis.cpp
index 9d8d2220..2678c8af 100644
--- a/src/initialization/FGTrimAnalysis.cpp
+++ b/src/initialization/FGTrimAnalysis.cpp
@@ -1812,7 +1812,7 @@ double Objective::myCostFunctionFull(Vector<double> & x) // x variations come fr
       + 0.010*qDot*qDot
       + 0.010*rDot*rDot;

-    static int count = 0;
+    static int count = 0; // FMPS
     count++;

     if ( f < TrimAnalysis->GetCostFunctionValue() )
@@ -1974,7 +1974,7 @@ double Objective::myCostFunctionFullWingsLevel(Vector<double> & x) // x variatio
       + 0.010*qDot*qDot
       + 0.010*rDot*rDot;

-    static int count = 0;
+    static int count = 0; // FMPS
     count++;

     if ( f < TrimAnalysis->GetCostFunctionValue() )
@@ -2130,7 +2130,7 @@ double Objective::myCostFunctionLongitudinal(Vector<double> & x)
         0.010
         *qDot*qDot;

-    static int count = 0;
+    static int count = 0; // FMPS
     count++;

     if ( f < TrimAnalysis->GetCostFunctionValue() )
@@ -2298,7 +2298,7 @@ double Objective::myCostFunctionFullCoordinatedTurn(Vector<double> & x)
       + 0.010*qDot*qDot
       + 0.010*rDot*rDot;

-    static int count = 0;
+    static int count = 0; // FMPS
     count++;

     if ( f < TrimAnalysis->GetCostFunctionValue() )
@@ -2464,7 +2464,7 @@ double Objective::myCostFunctionFullTurn(Vector<double> & x)
       + 0.010*qDot*qDot
       + 0.010*rDot*rDot;

-    static int count = 0;
+    static int count = 0; // FMPS
     count++;

     if ( f < TrimAnalysis->GetCostFunctionValue() )
@@ -2622,7 +2622,7 @@ double Objective::myCostFunctionPullUp(Vector<double> & x)
       + 0.010*qDot*qDot
       + 0.010*rDot*rDot;

-    static int count = 0;
+    static int count = 0; // FMPS
     count++;

     if ( f < TrimAnalysis->GetCostFunctionValue() )

diff --git a/src/input_output/FGXMLElement.h b/src/input_output/FGXMLElement.h
index c53e2bf9..716d5712 100644
--- a/src/input_output/FGXMLElement.h
+++ b/src/input_output/FGXMLElement.h
@@ -394,8 +394,8 @@ private:
   std::string file_name;
   int line_number;
   typedef std::map <std::string, std::map <std::string, double> > tMapConvert;
-  static tMapConvert convert;
-  static bool converterIsInitialized;
+  static tMapConvert convert; // FMPS
+  static bool converterIsInitialized; // FMPS
 };

 } // namespace JSBSim

diff --git a/src/models/FGAtmosphere.h b/src/models/FGAtmosphere.h
index 132e4c3f..4ece620a 100644
--- a/src/models/FGAtmosphere.h
+++ b/src/models/FGAtmosphere.h
@@ -267,7 +267,7 @@ protected:
   */
   static constexpr double g0 = 9.80665 / fttom;
   /// Specific gas constant for air - ft*lbf/slug/R
-  static double Reng;
+  static double Reng; // FMPS
   //@}

   static constexpr double SHRatio = 1.4;

diff --git a/src/models/atmosphere/FGWinds.cpp b/src/models/atmosphere/FGWinds.cpp
index 736dddb8..e012bc12 100644
--- a/src/models/atmosphere/FGWinds.cpp
+++ b/src/models/atmosphere/FGWinds.cpp
@@ -285,7 +285,7 @@ void FGWinds::Turbulence(double h)
       xi_v_km1 = 0, xi_v_km2 = 0, nu_v_km1 = 0, nu_v_km2 = 0,
       xi_w_km1 = 0, xi_w_km2 = 0, nu_w_km1 = 0, nu_w_km2 = 0,
       xi_p_km1 = 0, nu_p_km1 = 0,
-      xi_q_km1 = 0, xi_r_km1 = 0;
+      xi_q_km1 = 0, xi_r_km1 = 0;  // FMPS

     double

diff --git a/src/simgear/magvar/coremag.cxx b/src/simgear/magvar/coremag.cxx
index 3954edaf..1fe4778c 100644
--- a/src/simgear/magvar/coremag.cxx
+++ b/src/simgear/magvar/coremag.cxx
@@ -158,15 +158,15 @@ static const double htnm_wmm2005[13][13]=

 static const int nmax = 12;

-static double P[13][13];
-static double DP[13][13];
-static double gnm[13][13];
-static double hnm[13][13];
-static double sm[13];
-static double cm[13];
+static double P[13][13]; // FMPS
+static double DP[13][13]; // FMPS
+static double gnm[13][13]; // FMPS
+static double hnm[13][13]; // FMPS
+static double sm[13]; // FMPS
+static double cm[13]; // FMPS

-static double root[13];
-static double roots[13][13][2];
+static double root[13]; // FMPS
+static double roots[13][13][2]; // FMPS

 /* Convert date to Julian day    1950-2049 */
 unsigned long int yymmdd_to_julian_days( int yy, int mm, int dd )
@@ -199,7 +199,7 @@ double calc_magvar( double lat, double lon, double h, long dat, double* field )
     double yearfrac,sr,r,theta,c,s,psi,fn,fn_0,B_r,B_theta,B_phi,X,Y,Z;
     double sinpsi, cospsi, inv_s;

-    static int been_here = 0;
+    static int been_here = 0; // FMPS

     double sinlat = sin(lat);
     double coslat = cos(lat);
bcoconni commented 2 years ago

Honestly, I don't know what Messageswas designed for. @jonsberndt, any idea ? I can only find a pair of references to the methods FGJSBBase::*Message: https://github.com/JSBSim-Team/jsbsim/blob/0e842e751373f70ac927cef87a68a6e8fd1129df/src/models/FGLGear.cpp#L546-L551 https://github.com/JSBSim-Team/jsbsim/blob/0e842e751373f70ac927cef87a68a6e8fd1129df/src/models/FGLGear.cpp#L563-L568

Both of these occurrences could be replaced by a simple call to cout and then we could get rid of the whole FGJSBBase::*Message thing.

bcoconni commented 2 years ago

I create a temporary fix for this issues:

Methods

  1. Move the static variable(message, debug_level and other stuff) in FGJSBBase into a standalone struct CommonData.
  2. Add a reference of CommonData in FGJSBBase, and a API named ‘gdata()’ for accessing.
  3. And create a standalone CommonData instance, and put it in FGFDMExec, all others reference to this CommonData instance.
  4. Update all subclass of FGJSBBase to take the reference of CommonData in its constructor.
  5. NOTE: use reference instead pointer to avoid nullptr issue, and make it explicitly required.

Not only this design is uglier than the static variables but in addition it breaks the API backward compatibility. This is definitely not the way to go.

Bypass

  • Directly remove some of debug log in following files, as it hard to get reference to 'debug_lvl':
src/input_output/FGPropertyManager.cpp
src/input_output/FGPropertyReader.cpp
src/input_output/FGXMLElement.cpp
src/math/FGFunction.cpp: bool GetBinary(double val, const string &ctxMsg)

The variable debug_lvl is pretty harmless and can't be inadvertently freed so what would be the point of removing it ?

  • Use std::normal_distribution to replace FGJSBBase::GaussianRandomNumber() in Element::DisperseValue(file: src/input_output/FGXMLElement.cpp). I’m not sure my replacement code is the right way to go. Since I’m not going to use ‘disperse’ feature, i will leave to it.

That's a good point.

bcoconni commented 2 years ago

You're perfectly correct in raising all these usage of the static keyword as needing some rework ! :+1: In most cases, the current design is bad and is a result of the days where C++98 was the most up to date standard

All in all, I'm pretty sure the static keyword has been used in JSBSim for a lot of bad reasons and that there exists ways to get rid of it with modern C++.

Sorry for having rejected your PR but I'd suggest that we address the topic using modern C++ features rather than using one big chunk of data that would store all sort of unrelated data and break the API backward compatibility in the process.

jonsberndt commented 2 years ago

Honestly, I don't know what Messageswas designed for. @jonsberndt, any idea ?

It’s been a long time since that code was written. However, I believe the code was simply to give an indication of whether the landing gear had made contact with the runway, and possibly if it had left the runway, too. There may be other ways to do this.

The FGJSBBase::Message term was supposed to provide a way for the various modules to communicate with each other, but it could be that the reasoning for that is no longer valid. It was a pretty early capability.

Could try to remove it and see what happens. A notification of landing gear contact could probably be done in a script using notifications.

kemen209 commented 2 years ago

Thanks for the quick responce on this.

Yes, the temporary fix is bad idea, and definitly not the right way to go. It's just a day saver for me at the moment.

I'm glad we are on the right track. Hope the right solution will coming out in the near future.

rverma-dev commented 1 year ago

Can someone list what are the relevant tasks for this, I want to pick one or two if possible.

seanmcleod commented 1 year ago

The specific issue with regards to the message queue was resolved via the following commit - https://github.com/JSBSim-Team/jsbsim/commit/0e3b0252342a0bef37d3d99fcf52117868394921

j-baker commented 3 months ago

I have another bug I'd like to report which would be under the same 'epic'. When I create a bunch of JSBSim simulations in parallel (within a single process), I commonly get a deserialisation error, which reads like...

libc++abi: terminating due to uncaught exception of type std::invalid_argument:
In file /path/to/config.xml: line 15

Supplied unit: "FT2" cannot be converted to FT2

which obviously seems suspicious. I assume this is due to some buffer being shared between multiple instances of the deserialiser?

I can workaround by mutexing creation of the simulator, but worry that there's some deeper down bug.

seanmcleod commented 3 months ago

The following doesn't appear thread safe.

https://github.com/JSBSim-Team/jsbsim/blob/e33e184f512449b8120769825b20e4d35df8ee51/src/input_output/FGXMLElement.h#L397-L398

So on line 60 below thread 1 sets converterIsInitialized = true and before it has completely initialised the converter map thread 2 executes, thinks the converter map is initialised and then tries to lookup an entry like FT2 and fails.

https://github.com/JSBSim-Team/jsbsim/blob/e33e184f512449b8120769825b20e4d35df8ee51/src/input_output/FGXMLElement.cpp#L52-L66