Jhsmit / PyHDX

Derive ΔG for single residues from HDX-MS data
http://pyhdx.readthedocs.io
MIT License

Refactor loading data #305

Closed Jhsmit closed 1 year ago

Jhsmit commented 1 year ago

Large refactoring of how data is loaded into the HDXMeasurement object.

Updated some of the column names in the HDXMeasurement data. The mapping from old to new column names is:

- "start" -> "_start"
- "end" -> "_stop"
- "_start" -> "start"
- "_end" -> "stop"

Additionally, spaces in column names are now replaced with underscores.
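
For anyone migrating older tables by hand, here is a minimal sketch of the renaming described above. It is not taken from the PyHDX code base: `COLUMN_MAP` and `rename_column` are hypothetical names, and "uptake corrected" is only an illustrative column name containing a space. Because "start" and "_start" effectively swap roles, each name is looked up exactly once rather than renamed in sequence.

```python
# Old -> new column names, as listed in this PR description.
# COLUMN_MAP and rename_column are hypothetical helpers, not PyHDX API.
COLUMN_MAP = {
    "start": "_start",
    "end": "_stop",
    "_start": "start",
    "_end": "stop",
}


def rename_column(name: str) -> str:
    """Return the new-style name for a single old-style column name."""
    # One lookup per name, so the "start" <-> "_start" swap cannot chain.
    renamed = COLUMN_MAP.get(name, name)
    # Spaces in column names become underscores.
    return renamed.replace(" ", "_")


# "uptake corrected" is an illustrative column name, assumed for this example.
old_columns = ["start", "end", "_start", "_end", "uptake corrected"]
new_columns = [rename_column(c) for c in old_columns]
print(new_columns)
```

With pandas, the same function can be passed directly as `df.rename(columns=rename_column)`, since `DataFrame.rename` accepts a callable as the mapper.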

Jhsmit commented 1 year ago

Hi @ococrook, if you have some time, could you give some feedback on the new API and the .yaml files as I've reworked them in this PR?

The new .yaml format is as shown here: https://github.com/Jhsmit/PyHDX/blob/refactor_data_loading/tests/test_data/input/data_states.yaml

The other main change is the removal of the PeptideMasterTable object. In retrospect, implementing this part in an object-oriented way was a poor design decision. I think the new functional approach is much better and makes it easier for users to interface directly with the HDXMeasurement object. An example of how to use the new functions is here: https://github.com/Jhsmit/PyHDX/blob/a3838a3532d658dd16d50d738999706a694bbc61/templates/01_load_secb_data.py

A few steps in the new procedure related to back-exchange correction are still not very clear in their current implementation; I plan to tackle those in the future.

I'll update the docs shortly.