rs-station / reciprocalspaceship

Tools for exploring reciprocal space
https://rs-station.github.io/reciprocalspaceship/
MIT License

support for read_precognition() for hkl without anomalous columns #200

Closed DHekstra closed 1 year ago

DHekstra commented 1 year ago

Test-20DC_off_3sig.hkl.txt

Under some circumstances, as in this .hkl file provided by Vukica, Precognition writes five columns (h, k, l, FP, SIGFP) rather than seven. When I try to read the attached file with rs.read_precognition, I get the error "ParserError: Too many columns specified: expected 7 and found 5". I am using rs version 1.0.0. I have a workaround for now, but it isn't exactly tidy.

Perhaps just removing the usecols argument will do the trick?

dennisbrookner commented 1 year ago

Huh, yeah, I'm pretty surprised at this default behavior when providing more labels than there are columns, but neat, it should be easy to make use of! (This is the same call as here: https://github.com/rs-station/reciprocalspaceship/blob/2d6fa0e676d02576af691373bc146fbc9191bda0/reciprocalspaceship/io/precognition.py#L31, minus the usecols argument.)
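For readers without access to the screenshot, here is a minimal sketch of the pandas behavior being discussed. It is not the library's code: the reflection values are made up, and the seven column labels are an assumption about what rs.read_precognition uses for .hkl files. The point is only that, without usecols, pd.read_csv accepts more names than there are columns in the data and fills the missing trailing columns with NaN.

```python
import io

import pandas as pd

# Two reflections in the five-column layout (h, k, l, FP, SIGFP).
# These values are invented for illustration.
five_column_hkl = io.StringIO(
    "1 0 0 105.2 3.1\n"
    "1 1 0 87.6 2.4\n"
)

# Seven labels, as expected for a seven-column .hkl file.
# The exact label strings are an assumption, not taken from the thread.
names = ["H", "K", "L", "F(+)", "SigF(+)", "F(-)", "SigF(-)"]

# No usecols argument: pandas does not raise here, and the two labels
# that have no matching data column are filled with NaN.
df = pd.read_csv(five_column_hkl, sep=r"\s+", header=None, names=names)

print(df)
```

With this approach, a five-column file ends up with all-NaN F(-) and SigF(-) columns, so downstream code would still need to decide how to treat them.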

[Screenshot 2023-01-03 at 11:39:06 AM]

JBGreisman commented 1 year ago

Just a note -- that HKL file may have been post-processed (for example, with cut to remove the extra columns) or may come from an older version of Precognition (<5.0). According to the Precognition manual (see p. 129), .hkl files have 7 columns, and the two extra columns are filled with 0 and 1, respectively, when the Friedel pairs are merged.

I'm ok with removing the usecols because this is a rather targeted user base, but it is there to ensure that the .hkl file is compatible with the Precognition output format.

In general, raw text formats can be read in using df = pd.read_csv() as @dennisbrookner did above, and can then be converted to a DataSet by passing the DataFrame to rs.DataSet(df), and setting the relevant arguments.
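As a rough sketch of that general route for a five-column file: the helper names read_five_column_hkl and to_dataset below are hypothetical, as are the column labels "F" and "SigF". The cell and spacegroup must be supplied by the caller, since the text file does not carry them. The dtype names passed to astype are my understanding of the rs dtype names, so treat them as an assumption to check against the rs documentation.

```python
import io

import pandas as pd


def read_five_column_hkl(source):
    """Read whitespace-delimited h, k, l, F, SigF columns into a DataFrame.

    `source` may be a file path or a file-like object.
    """
    return pd.read_csv(
        source,
        sep=r"\s+",
        header=None,
        names=["H", "K", "L", "F", "SigF"],  # hypothetical labels
    )


def to_dataset(df, cell, spacegroup):
    """Convert the DataFrame to an rs.DataSet (requires reciprocalspaceship).

    The dtype names below are assumed, not confirmed by the thread.
    """
    import reciprocalspaceship as rs  # imported lazily; only needed here

    ds = rs.DataSet(df, cell=cell, spacegroup=spacegroup)
    ds = ds.astype(
        {"H": "HKL", "K": "HKL", "L": "HKL", "F": "SFAmplitude", "SigF": "Stddev"}
    )
    return ds.set_index(["H", "K", "L"])


# Example with one made-up reflection, read from memory instead of a file.
df = read_five_column_hkl(io.StringIO("1 0 0 105.2 3.1\n"))
```

Calling to_dataset(df, cell, spacegroup) with the unit cell and space group of the crystal would then give a DataSet indexed by Miller indices.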

DHekstra commented 1 year ago

OK, fair enough--I am unaware of the history of Precognition's output choices. I'm fine with either approach. My workaround was already based on pd.read_csv and works.

DHekstra commented 1 year ago

Ready to close?

JBGreisman commented 1 year ago

Feel free to close if you're fine with the current state. If you think rs.read_precognition() should be a bit more flexible to support the 5-column .hkl format, I am happy to consider that further as well.

DHekstra commented 1 year ago

OK, will close. We can revisit this if it turns out that current versions of Precognition do still output five-column hkl files under some conditions.