rs-station / reciprocalspaceship

Tools for exploring reciprocal space
https://rs-station.github.io/reciprocalspaceship/
MIT License

MTZ data set hierarchy? #103

Closed DHekstra closed 2 years ago

DHekstra commented 2 years ago

I was curious whether it's possible to write MTZ files that assign 'crystals' and/or 'projects' to columns (in the sense of https://www.ccp4.ac.uk/html/mtzformat.html). I ran into this when trying to run a data set through SCALEIT, which objected that the data sets being scaled belonged to the same crystal. I worked around the issue by writing separate MTZ files and merging them in CAD. My understanding is that GEMMI can handle project names and crystal names (https://gemmi.readthedocs.io/en/latest/hkl.html#mtz-format).
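To make the hierarchy in question concrete: in the MTZ format each column belongs to a dataset, and each dataset carries a project name and a crystal name. The snippet below is only a plain-Python illustration of that mapping. `MtzDataset` and `columns_by_crystal` are made-up names for this sketch, and the grouping is not meant to reproduce whatever check SCALEIT actually performs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MtzDataset:
    """Hypothetical stand-in for one MTZ dataset header entry."""
    dataset_id: int
    project: str
    crystal: str
    dataset: str


def columns_by_crystal(datasets, column_to_dataset_id):
    """Group column labels by the (project, crystal) pair of their dataset."""
    by_id = {d.dataset_id: d for d in datasets}
    grouped = defaultdict(list)
    for label, ds_id in column_to_dataset_id.items():
        d = by_id[ds_id]
        grouped[(d.project, d.crystal)].append(label)
    return dict(grouped)


# Two datasets in one project, each assigned to its own crystal.
datasets = [
    MtzDataset(1, "demo", "native", "native"),
    MtzDataset(2, "demo", "derivative", "derivative"),
]
# Column label -> dataset ID, as recorded per column in an MTZ header.
columns = {"F_nat": 1, "SIGF_nat": 1, "F_der": 2, "SIGF_der": 2}

grouped = columns_by_crystal(datasets, columns)
```

If every column pointed at the same dataset ID, `grouped` would contain a single crystal, which is the situation described above.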

JBGreisman commented 2 years ago

This is something I've been considering implementing. It is supported in gemmi, so it wouldn't be hard to at least allow custom naming of projects/crystals.

Right now, I construct a gemmi.Mtz with a dataset named "reciprocalspaceship": https://github.com/Hekstra-Lab/reciprocalspaceship/blob/f98b0dd0e9e665c76153e17837bd4ca123780f23/reciprocalspaceship/io/mtz.py#L103. I can add an attribute that allows one to customize that name.

However, I will note that I run scaleit pretty often and haven't had this issue, so I think we can also get around this issue with a change to your workflow.

JBGreisman commented 2 years ago

Just to expand a bit, this sort of support would only make sense within DataSet.write_mtz() and DataSet.to_gemmi(). In my mind, maintaining this sort of hierarchy more broadly doesn't make sense because it is very specific to the MTZ data format, whereas rs.DataSet aims to support general reflection data.