libAtoms / extxyz

Extended XYZ specification and parsers

C example reads the whole 2nd line as a comment #20

Open stenczelt opened 2 weeks ago

stenczelt commented 2 weeks ago

The C example builds (ARM Mac, clang 15; #19 worked around by hard-coding local paths), but it reads the entire 2nd line of a file as the comment key of the info dict. I'd expect it to read Lattice and the other keys separately.

example.xyz:

1
Lattice="0.012380693021800743 1.714399199949794 1.714399199949794 1.7192339837072965 0.0075459092642982985 1.7192339837072965 1.7090956692585484 1.7090956692585484 0.017684223713046465" Properties=species:S:1:pos:R:3 generated_from_mp_id=mp-111 data_gen_compression_percent=50.0 data_gen_shear="1.0 0.0006564438167393784 0.00659231380053108 0.0006564438167393784 1.0 0.0037616018582884514 0.00659231380053108 0.0037616018582884514 1.0" data_gen_timestamp=2024-05-05T14:23:15.005747 pbc="T T T"
Ne      -0.00623274       0.00041326      -0.02325031

running ./cextxyz example.xyz T

parsed success 1
nat 1
info
key 'comment' type 4 shape 0 0
arrays
key 'species' type 4 shape 0 1
key 'pos' type 2 shape 1 3
1
comment="Lattice=\"0.012380693021800743 1.714399199949794 1.714399199949794 1.7192339837072965 0.0075459092642982985 1.7192339837072965 1.7090956692585484 1.7090956692585484 0.017684223713046465\" Properties=species:S:1:pos:R:3 generated_from_mp_id=mp-111 data_gen_compression_percent=50.0 data_gen_shear=\"1.0 0.0006564438167393784 0.00659231380053108 0.0006564438167393784 1.0 0.0037616018582884514 0.00659231380053108 0.0037616018582884514 1.0\" data_gen_timestamp=2024-05-05T14:23:15.005747 pbc=\"T T T\"" Properties=species:S:1:pos:R:3
Ne        -0.00623274       0.00041326      -0.02325031
written err_stat 0

jameskermode commented 2 weeks ago

This is probably related to how you added the extra args in #19: as I recall, the new interface can optionally read the whole comment line into a string rather than an info dict, for compatibility with plain (i.e. not extended) XYZ files. If you show me the changes you made to test_C_main(), I'm sure we can resolve this.
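To make the two behaviours concrete, here is a minimal Python sketch of the difference between treating the second line as key=value pairs and treating it as one opaque comment. It is an illustration only, not the libextxyz implementation: the function name `parse_comment_line`, the `comment_only` flag, and the simplified regular expression are all assumptions made for this example.

```python
import re

# Hypothetical, simplified key=value matcher: a bare token or a double-quoted string.
# The real extxyz grammar is richer; this only illustrates the two modes.
_KV = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_comment_line(line, comment_only=False):
    """Return an info dict for the second line of an XYZ frame."""
    if comment_only:
        # Plain XYZ mode: the whole line is kept as a single comment string.
        return {"comment": line}
    info = {}
    for key, value in _KV.findall(line):
        # Strip the surrounding quotes from quoted values.
        if value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        info[key] = value
    return info


line = 'Lattice="1 0 0 0 1 0 0 0 1" Properties=species:S:1:pos:R:3 pbc="T T T"'
extended = parse_comment_line(line)
plain = parse_comment_line(line, comment_only=True)
print(sorted(extended))  # ['Lattice', 'Properties', 'pbc']
print(list(plain))       # ['comment']
```

The output in the report, with a single `comment` key holding the entire line, matches the second mode.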

stenczelt commented 2 weeks ago

See #21, which now has the changes.

stenczelt commented 2 weeks ago

Here's another peculiarity: if I shorten this file a little, the same program ends with a segfault:

1
Lattice="0.01 1.7 1.71 1.72 0.0 1.73 1.70 1.74 0.02" Properties=species:S:1:pos:R:3:x:R:1 hello="world" timestamp=2024-05-05T14:23:15.005747 pbc="T T T"
Ne      -0.01 0.1 -0.02   0.123

However, ASE can read this and it also looks reasonable to me as an XYZ file.
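As a quick sanity check that the file is self-consistent, the column count declared in Properties can be compared with the number of fields on the atom line. The sketch below is pure Python and uses a hypothetical helper, `expected_columns`; it checks only the column count, not the value types.

```python
def expected_columns(properties):
    """Sum the column counts in a Properties string such as 'species:S:1:pos:R:3'."""
    fields = properties.split(":")
    # Fields come in (name, type, ncols) triples.
    if len(fields) % 3 != 0:
        raise ValueError(f"malformed Properties string: {properties!r}")
    return sum(int(ncols) for ncols in fields[2::3])


properties = "species:S:1:pos:R:3:x:R:1"
atom_line = "Ne      -0.01 0.1 -0.02   0.123"
print(expected_columns(properties), len(atom_line.split()))  # 5 5
```

Both numbers are 5, so the declared layout and the atom line agree.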

I tried this on macOS and also with Ubuntu/gcc-13 in Docker:

$ ./libextxyz/cextxyz example.xyz T

parsed success 0
nat 1380930130
info
[1]    12001 segmentation fault  ./libextxyz/cextxyz example.xyz T

jameskermode commented 2 weeks ago

The file looks OK to me too, so that looks like a bug. Can you post a full stack trace? That will require compiling with debugging symbols (-g in CFLAGS).

stenczelt commented 2 weeks ago

I've tried adding -Og -g3 to the CFLAGS in libextxyz/Makefile, but that doesn't seem to have made a difference... Am I doing something wrong, or does the fault perhaps happen in a dependency that is not compiled with -g here?

(venv) ➜  libextxyz git:(master) ✗ make cextxyz
cc -g -Og -g3  -I../libcleri/inc -g -c test_C_main.c -o test_C_main.o
cc -g -Og -g3  -I../libcleri/inc -g -c extxyz.c -o extxyz.o
cc -g -Og -g3  -I../libcleri/inc -g -c extxyz_kv_grammar.c -o extxyz_kv_grammar.o
gfortran -g test_C_main.o extxyz.o extxyz_kv_grammar.o -o cextxyz ../libcleri/Release/libcleri.a  -lpcre2-8  -L../libcleri/Release
(venv) ➜  libextxyz git:(master) ✗ ./cextxyz b.xyz T
parsed success 0
nat 1380930130
info
[1]    8804 segmentation fault  ./cextxyz b.xyz T
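
One detail in the output may help narrow this down while a stack trace is pending: the value printed for nat is the same in both runs, and its bytes decode to ASCII text. The sketch below assumes nat is stored as a 32-bit little-endian int, which is an assumption about the C program and the platforms rather than something confirmed in this thread.

```python
import struct

nat = 1380930130  # value printed as "nat" in both failing runs

# Reinterpret the integer as 4 raw bytes (assumed: 32-bit, little-endian).
raw = struct.pack("<i", nat)
print(hex(nat), raw)  # 0x524f5252 b'RROR'
```

The bytes spell "RROR", which would be consistent with nat never being set after the failed parse (parsed success 0) and holding leftover bytes from a string such as "ERROR". That is only a guess; the stack trace would be needed to confirm where the fault actually occurs.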