libAtoms / abcd

abcd download-ed configs have elements JSON string that messes up gap fitting #105

Open gelzinyte opened 3 years ago

gelzinyte commented 3 years ago

xyz frames from abcd download contain an elements entry (e.g. elements="_JSON {\"6\": 2, \"8\": 1, \"1\": 6}") which messes with reading structures in gap_fit. The simplest solution might be to just not write elements when abcd-downloading? Especially because there's also a formula entry (formula=C2H6O).
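Until that is settled, one possible stopgap is to drop the entry from the downloaded file before handing it to gap_fit. The sketch below is plain Python and not part of abcd or QUIP; the helper names `drop_info_key` and `strip_key_from_frames` are made up for illustration. It assumes each frame is an atom-count line, one comment line, then that many atom lines, and that the key does not also appear inside another entry's quoted value.

```python
import re


def drop_info_key(comment_line, key):
    """Remove one key=value entry from an extended-XYZ comment line.

    The value may be a bare token or a double-quoted string in which
    backslash-escaped characters (such as \\") are allowed.
    Trailing whitespace, including a newline, is stripped from the result.
    """
    pattern = re.compile(
        r'(?:^|(?<=\s))' + re.escape(key) + r'=(?:"(?:\\.|[^"\\])*"|\S+)\s*'
    )
    return pattern.sub("", comment_line).rstrip()


def strip_key_from_frames(lines, key):
    """Apply drop_info_key to the comment line of every frame.

    `lines` is a list of lines without trailing newlines, e.g. from
    str.splitlines(). Assumed layout per frame: atom count, comment
    line, then that many atom lines.
    """
    out = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # tolerate blank lines between frames
            out.append(lines[i])
            i += 1
            continue
        n_atoms = int(lines[i].split()[0])
        out.append(lines[i])
        out.append(drop_info_key(lines[i + 1], key))
        out.extend(lines[i + 2 : i + 2 + n_atoms])
        i += 2 + n_atoms
    return out
```

Something like `strip_key_from_frames(open("train.xyz").read().splitlines(), "elements")` would then give the lines to write back out. The formula entry is left untouched.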

gabor1 commented 3 years ago

hm... the XYZ formatting is in a bit of a flux right now. But some form of complex data type will be there (check out the #extxyz channel on our slack), so maybe we need to teach gap_fit to just treat that as a string and not worry about it.

gabor1 commented 3 years ago

how does gap_fit fail? there are already random string items in the xyz files that I use and they are fine. maybe it's the quoted quotes?

gelzinyte commented 3 years ago

Not sure if it's just the quoted quotes (maybe non-escaped {?), but it looks like the parsing of the info line gets messed up

$ head train.xyz
9
Lattice="50.0 0.0 0.0 0.0 50.0 0.0 0.0 0.0 50.0" Properties=species:S:1:pos:R:3:dft_forces:R:3 normal_mode_temperature=T dft_energy=-4212.509602197413 compound=ethanol n_atoms=9 config_type=ethanol_mol hash=50b506dbeef00fee5a23aceeb7ac0eb9 username=eg475 filename=train.xyz hash_structure=0154021c79357894b4136fd608fefad4 volume=124999.99999999991 mol_or_rad=mol dataset_type=train_compounds_train_configs normal_mode_energy=0.23389851991840674 formula=C2H6O elements="_JSON {\"6\": 2, \"8\": 1, \"1\": 6}" pbc="F F F"
C       -0.93401342       0.07042787       0.05111524       0.35621935      -0.69215077      -0.84583725
C        0.53526377      -0.43666892      -0.11341624      -1.69537623       1.01342896       1.25964530
O        1.47668645       0.40662443       0.51842773       0.09388185      -0.42907879       0.50304526
$ bash sub.sh
Wed 14 Apr 17:27:59 BST 2021
SYSTEM ABORT: Traceback (most recent call last)
File "/opt/womble/QUIP/git_repo/src/libAtoms/xyz.c", line 765 kind IO
Missing value for parameter "6 pbc"

gabor1 commented 3 years ago

yes, it looks like the fortran parser doesn’t cope with the escaped quotes.
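For reference, here is a minimal Python sketch of what "coping with escaped quotes" could look like when splitting the info line. It is not the QUIP parser and does not reproduce its behaviour; `parse_info` is a hypothetical name. It assumes every entry is written as key=value (bare keys are not handled) and keeps each value as an opaque string, in line with the idea of treating the JSON entry as just a string.

```python
def parse_info(comment_line):
    """Split an extended-XYZ comment line into {key: raw string value}.

    Quoted values may contain whitespace and backslash-escaped
    characters; the backslash is dropped and the escaped character kept.
    Assumption: every entry has the form key=value.
    """
    info = {}
    i, n = 0, len(comment_line)
    while i < n:
        if comment_line[i].isspace():
            i += 1
            continue
        eq = comment_line.find("=", i)
        if eq == -1:
            raise ValueError("expected key=value near position %d" % i)
        key = comment_line[i:eq]
        i = eq + 1
        if i < n and comment_line[i] == '"':
            i += 1  # skip the opening quote
            chars = []
            while i < n and comment_line[i] != '"':
                if comment_line[i] == "\\" and i + 1 < n:
                    i += 1  # skip the backslash, keep the next character
                chars.append(comment_line[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated quote in value of %r" % key)
            i += 1  # skip the closing quote
            info[key] = "".join(chars)
        else:
            j = i
            while j < n and not comment_line[j].isspace():
                j += 1
            info[key] = comment_line[i:j]
            i = j
    return info
```

With this, the elements entry from the frame above comes back as the string `_JSON {"6": 2, "8": 1, "1": 6}` and pbc is still read as its own key, so a reader that does not care about the JSON can simply ignore it.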
