electronsandstuff / easygdf

EasyGDF is a python interface to GDF files used in the particle tracking code GPT
BSD 3-Clause "New" or "Revised" License

Initial Distribution Integer IDs Fail Silently #5

Closed electronsandstuff closed 2 years ago

electronsandstuff commented 2 years ago

The following code using integer IDs saves an initial distribution file with an empty ID array.

easygdf.save_initial_distribution(distname, ... ID=np.arange(x.shape[0])+1)

It works correctly when called as follows, which passes an array of floats.

easygdf.save_initial_distribution(distname, ... ID=np.arange(x.shape[0])+1.0)

The expected behavior is that easygdf either saves the integers or converts to floats if GPT requires IDs in the distribution file to be floats.
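As a minimal sketch of the float workaround described above, the cast can be done on the NumPy side before the array is handed to easygdf. The particle count of 37 is borrowed from the gpt error message in this thread; the variable names are illustrative only, and the call to `easygdf.save_initial_distribution` is left out because its other arguments are elided in the report.

```python
import numpy as np

n_particles = 37  # illustrative; matches the count in the gpt error message

# Integer IDs as in the failing call (dtype pinned so the example is platform independent)
ids_int = np.arange(n_particles, dtype=np.int64) + 1

# Same IDs as floats, as in the call that works
ids_float = ids_int.astype(np.float64)

# float64 represents every integer up to 2**53 exactly, so the round trip is lossless here
assert np.array_equal(ids_float.astype(np.int64), ids_int)

print(ids_int.dtype, ids_float.dtype)
```

Passing `ids_float` as the `ID` argument corresponds to the second call shown above.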

electronsandstuff commented 2 years ago

Update: The issue is not that easygdf doesn't save the IDs. It does save them, but GPT does not read them in. gdf2a reports them as an invalid datatype.

Here is the output of gdf2a.

         x          y        GBx        GBy        GBz          t         ID          z 
 0.000e+00  0.000e+00  0.000e+00  0.000e+00  1.000e+02  0.000e+00 Invalid type 0.000e+00 
-3.000e-06  0.000e+00  0.000e+00  0.000e+00  1.000e+02  0.000e+00 Invalid type 0.000e+00 
-2.000e-06  0.000e+00  0.000e+00  0.000e+00  1.000e+02  0.000e+00 Invalid type 0.000e+00

Here is the output from gpt.

Warning: Environment variable OMP_WAIT_POLICY=PASSIVE not set.
gpt: /tmp/gptrunner_31965129272c4cd98dab5258f4b714f0.in(7): Error: Invalid length of "ID": Contains 0 points where 37 are required

electronsandstuff commented 2 years ago

By inspecting the output file and dropping breakpoints I was able to confirm that easygdf correctly detects the datatype of the numpy array as int64 and writes it to the file as such. I was also able to confirm from the GPT documentation and headers that the type ID easygdf uses for the int64 datatype is correct.

From what I can tell, easygdf is doing what's expected of it. However, there's still the issue of it creating files that gpt does not read in. This will confuse users and I think I should do something about it. I can see a couple of options.

I have to leave for the moment, but will look at it further in a second.

TODO:

electronsandstuff commented 2 years ago

OK, I ran a test using gdf2a to open a file containing every dtype in both array and single form.

Here are the results for single types.

(double)  0.000e+00
(float)   0.000e+00
(s8)              0
(s16)             0
(s32)             0
Type=80  Type=80
(u8)               0
(u16)              0
(u32)              0
Type=70  Type=70
(ascii)  deadbeef
(nul)       No data
(Undef)  No data

Here are the results for array types.

 (double)    (float)  (s8)  (s16)  (s32)       Type=80  (u8)  (u16)  (u32)       Type=70       (Undef) 
0.000e+00  0.000e+00     0      0      0  Invalid type     0      0      0  Invalid type  Invalid type
1.000e+00  1.000e+00     1      1      1  Invalid type     1      1      1  Invalid type  Invalid type

Interestingly, neither 64-bit integer type appears to be supported. They are shown as an unrecognized type (Type=80 and Type=70) in single form and as "Invalid type" in array form. If I run strings on the gdf2a binary I don't even see the values "(u64)" or "(s64)" in there. I think support for them really doesn't exist and that the constants in the header files were just never implemented.

The undefined datatype also does not produce a valid array.
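The test results can be summarized as a small lookup. This is a sketch only: the set below records what gdf2a printed correctly in the run above, not a published specification, and the mapping from gdf2a's labels to NumPy dtypes (for example "(double)" to `float64`, "(s32)" to `int32`) is an assumption. The name `is_gdf2a_readable` is hypothetical and is not part of easygdf.

```python
import numpy as np

# Numeric types that gdf2a displayed correctly in the test above.
# Assumed label mapping: (double), (float), (s8), (s16), (s32), (u8), (u16), (u32).
GDF2A_READABLE = {
    np.dtype(t)
    for t in (
        np.float64, np.float32,
        np.int8, np.int16, np.int32,
        np.uint8, np.uint16, np.uint32,
    )
}


def is_gdf2a_readable(arr):
    """Return True if the array's dtype is one gdf2a printed correctly in the test."""
    return np.asarray(arr).dtype in GDF2A_READABLE


print(is_gdf2a_readable(np.zeros(2, dtype=np.int32)))
print(is_gdf2a_readable(np.zeros(2, dtype=np.int64)))
```

With this set, both `int64` and `uint64` arrays are flagged, which matches the "Invalid type" columns in the array output.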

electronsandstuff commented 2 years ago

Since gpt is really the only program that uses GDF files, I will treat it as defining the standard. If 64-bit integers are not a valid datatype there, then I don't think my program should emit them. The question is then what to do when 64-bit ints are passed to easygdf.
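One possible answer, sketched here purely for illustration, is to narrow 64-bit integers to the 32-bit types that gdf2a did display, and to raise an error when the values do not fit. This is not a description of what easygdf does; `coerce_64bit_ints` is a hypothetical helper name, and converting to floats instead (as suggested at the top of this issue) would be an equally plausible policy.

```python
import numpy as np


def coerce_64bit_ints(arr):
    """Hypothetical helper: narrow int64/uint64 arrays to 32 bits, or raise if they don't fit."""
    arr = np.asarray(arr)
    if arr.dtype == np.int64:
        target = np.int32
    elif arr.dtype == np.uint64:
        target = np.uint32
    else:
        # Any other dtype is passed through untouched
        return arr

    info = np.iinfo(target)
    if arr.size and (arr.min() < info.min or arr.max() > info.max):
        raise ValueError(
            f"values do not fit in {np.dtype(target).name}; "
            "64-bit integers were not readable by gdf2a in the test above"
        )
    return arr.astype(target)


ids = coerce_64bit_ints(np.arange(37, dtype=np.int64) + 1)
print(ids.dtype)
```

Raising on overflow rather than wrapping silently keeps the failure visible, which is the concern raised in the title of this issue.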