Open Augustin-Zidek opened 2 months ago
Could it be that your components.cif file is compressed? If you extract that file (the one in /var/cache/libcifpp), does that help?
You mentioned using zstd. That's a good suggestion, but when you use the bundled script to update components.cif, it writes the file out uncompressed, which removes the need for decompression entirely.
For reference, cif-validate on 7soy takes about 0.2 seconds on my laptop:
$ time build/cif-validate /tmp/7soy.cif.gz
real 0m0,246s
user 0m0,239s
sys 0m0,007s
Hello, many thanks for the development and maintenance of libcifpp!
I've noticed that cif::pdb::reconstruct_pdbx is very slow. E.g. on the 7soy mmCIF file from the PDB, parsing takes < 0.2 seconds, but running cif::pdb::reconstruct_pdbx on it takes roughly 4.5 seconds, i.e. a 20x slow-down if one wants to perform the correctness check/autofix.
The vast majority of the time is spent in cif::compound_factory::create. Could that time be reduced? Also, cif::compound_factory::create seems to be called from multiple places. Would it make sense to cache that load?
I think that this could also be sped up if the CCD was compressed using zstd instead of gzip, as it decompresses much faster.
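To make the caching idea concrete, here is a minimal sketch of a memoizing wrapper around an expensive per-compound load. It is not libcifpp code: `Compound`, `CompoundCache`, and the `Loader` callback are hypothetical names made up for illustration, and the sketch assumes a compound can be identified by its ID string and shared read-only once loaded. I have not checked how libcifpp stores compounds internally, so it may already do something similar.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a parsed CCD compound; not a libcifpp type.
struct Compound {
    std::string id;
};

// Memoizing wrapper: the expensive loader runs at most once per compound ID,
// and every later request for the same ID returns the stored pointer.
class CompoundCache {
public:
    using Loader =
        std::function<std::shared_ptr<const Compound>(const std::string&)>;

    explicit CompoundCache(Loader loader) : loader_(std::move(loader)) {}

    std::shared_ptr<const Compound> get(const std::string& id) {
        // The lock is held during the load as well, so concurrent callers
        // are serialized. That keeps the sketch simple at the cost of
        // parallelism.
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = cache_.find(id);
        if (it != cache_.end()) {
            return it->second;  // fast path: already loaded
        }
        auto compound = loader_(id);  // slow path: parse from the CCD
        // A null result is stored too, so an unknown ID is not re-parsed.
        cache_.emplace(id, compound);
        return compound;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return cache_.size();
    }

private:
    Loader loader_;
    mutable std::mutex mutex_;
    std::unordered_map<std::string, std::shared_ptr<const Compound>> cache_;
};
```

With a wrapper like this, every call site that needs a compound would go through one shared `CompoundCache::get`, so a structure that references the same residue type thousands of times would pay the parsing cost once per distinct compound ID.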