stur86 / crystvis-js

A Three.js based crystallographic visualisation tool
https://stur86.github.io/crystvis-js/
MIT License

Fails to read Magres files with >100 atoms #26

Open ThatPerson opened 2 years ago

ThatPerson commented 2 years ago

When loading a Magres file with >100 atoms, the ms lines end up looking like this:

ms H100          2.0431056083497484E+01         -5.5892879764485039E+00          1.2418129913251554E+00         -4.0897852040395684E+00          2.9708843218858465E+01         -3.8875121992744055E+00         -1.1586267278454099E+00          5.0678781904684089E-01          2.4711193494436703E+01

i.e. the H and the 100 are fused into a single field. The following code (lines 150-155 in formats/magres.js) splits each line of the block on whitespace:

        for (let i = 0; i < block.length; ++i) {
            let l = block[i];
            let lspl = _.trim(l).split(/\s+/);
            // Is it a 'units' line?
            if (lspl[0] == 'units') {
                let tag = lspl[1];

but for atoms where the atom label and number have been concatenated, this will only have 8 members, and so parseOneAtomLine throws an "Input matrix is not symmetric" error.
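To make the field count concrete, here is a minimal sketch of the same whitespace split applied to a well-formed line and to a fused one. It uses the native `String.prototype.trim` rather than lodash's `_.trim`, the `tokenize` helper and the placeholder tensor values are made up for illustration, and it assumes the reader takes the first three fields as tag, label and index.

```javascript
// Nine placeholder tensor components (illustrative values, not real shieldings).
const tensor = Array.from({ length: 9 }, (_, k) => (k + 1).toExponential(4));

// Same record written with and without a space between label and index.
const spacedLine = ['ms', 'H', '99', ...tensor].join('   ');
const fusedLine = ['ms', 'H100', ...tensor].join('   ');

// Hypothetical helper mirroring the split used in the snippet above.
function tokenize(line) {
  return line.trim().split(/\s+/);
}

const spacedTokens = tokenize(spacedLine);
const fusedTokens = tokenize(fusedLine);

console.log(spacedTokens.length); // 12: tag, label, index, 9 components
console.log(fusedTokens.length);  // 11: label and index share one field

// Assumption: fields 0-2 are read as tag, label and index.
// On the fused line that leaves only 8 values for a 3x3 tensor.
const spacedComponents = spacedTokens.slice(3);
const fusedComponents = fusedTokens.slice(3);
console.log(spacedComponents.length, fusedComponents.length); // 9 8
```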

stur86 commented 2 years ago

Is the issue here that the H symbol and the 100 number have no space? I am sorry but that is a problem with the writer, not the reader, and I can't do much about it. The symbol in the magres block is a label, not a chemical element - it can be anything (these are defined in the atom lines earlier). That means that for example H123 couldn't be parsed if I didn't split by space, because nothing would separate H 123 from H1 23. I can imagine some hackish solution, of course, but I think this really goes beyond what is reasonable to expect of a reader to guess, and would likely be liable to produce unpredictable output for other inputs. What writer produced this file?
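The ambiguity can be sketched by enumerating every way a fused token could be cut into a label and an index. The `candidateSplits` function is hypothetical and not part of crystvis-js; it assumes indices are positive integers written without leading zeros, and that a label may itself end in digits.

```javascript
// Hypothetical helper: list every (label, index) pair a fused token could stand for.
// Assumption: an index is a positive integer with no leading zeros.
function candidateSplits(token) {
  const splits = [];
  for (let cut = 1; cut < token.length; ++cut) {
    const label = token.slice(0, cut);
    const index = token.slice(cut);
    if (/^[1-9]\d*$/.test(index)) {
      splits.push({ label, index: Number(index) });
    }
  }
  return splits;
}

const ambiguous = candidateSplits('H123');
console.log(ambiguous.map((s) => `${s.label} ${s.index}`));
// [ 'H 123', 'H1 23', 'H12 3' ]

const single = candidateSplits('H100');
console.log(single.map((s) => `${s.label} ${s.index}`));
// [ 'H 100' ]
```

Under these assumptions H100 happens to have a single reading, but H123 has three, so a reader that tried to guess the split would have to choose between them without any information in the line itself.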

ThatPerson commented 2 years ago

It was an output Magres file from CASTEP 20.11. I guess somewhere they're assuming that the format is defined by character spacing rather than by whitespace, as the Magres spec says? It only occurs in the ms/efg blocks, not in the atom block.

stur86 commented 2 years ago

Sounds like a typical Fortran format string issue... yes, I would say this needs addressing. I can bring it up in CASTEP. I don't think it's format compliant to have no separating space. See the original format specification:

The file is essentially a series of blocks consisting of rows containing whitespace-delimited records, with datatype distinguished by tags in the first column.

It's very clear about whitespace being the delimiter. No relying on just character counts.
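As an illustration of how a fixed-width writer can fuse two fields, here is a small JavaScript sketch. The column width and both writer functions are hypothetical; this is not CASTEP's actual format string, only a model of the general mechanism.

```javascript
// Hypothetical fixed-width writer: index right-justified in `width` columns,
// with no explicit separator after the label.
function fixedWidthField(label, index, width = 3) {
  return label + String(index).padStart(width);
}

// Hypothetical whitespace-delimited writer: always emits a separating space.
function delimitedField(label, index, width = 3) {
  return label + ' ' + String(index).padStart(width);
}

const countFields = (s) => s.trim().split(/\s+/).length;

console.log(JSON.stringify(fixedWidthField('H', 99)));  // "H 99"
console.log(JSON.stringify(fixedWidthField('H', 100))); // "H100"
console.log(JSON.stringify(delimitedField('H', 100)));  // "H 100"

console.log(countFields(fixedWidthField('H', 99)));  // 2
console.log(countFields(fixedWidthField('H', 100))); // 1
console.log(countFields(delimitedField('H', 100)));  // 2
```

In this model the padding only behaves as a separator while the index is narrower than its column; once the index fills the column, the fields run together, whereas the explicit space keeps the record whitespace-delimited at any index.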