rs-station / rs-booster

Useful scripts for analyzing diffraction
https://rs-station.github.io/rs-booster/
MIT License

Adding a background subtraction script #19

Open alisiafadini opened 2 years ago

alisiafadini commented 2 years ago

The script I have essentially takes 2 MTZ files (light and dark), a reference PDB, and XYZ coordinates for a region of interest. It returns a background-subtracted map (similar to the PanDDA one) where the subtraction is done in reciprocal space ( |F1| - N*|F2| ) rather than with 2mFo-DFc maps. For now there is an option to weight the differences as usual. I was thinking it makes sense to keep this as its own script for now, but I'm open to suggestions.
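A minimal sketch of the reciprocal-space step, |F1| - N*|F2|, is shown here for illustration. It assumes the light and dark amplitudes are already scaled and matched on the same Miller indices; the function name `background_subtract` and the toy values are hypothetical, and the weighting and the phases from the reference PDB are left out.

```python
import numpy as np


def background_subtract(f_light, f_dark, n):
    """Return |F_light| - n * |F_dark| for matched reflections.

    Hypothetical helper: both inputs are assumed to be amplitude arrays
    of the same length, already scaled and in the same reflection order.
    """
    f_light = np.asarray(f_light, dtype=float)
    f_dark = np.asarray(f_dark, dtype=float)
    if f_light.shape != f_dark.shape:
        raise ValueError("light and dark amplitudes must have the same shape")
    return f_light - n * f_dark


# Toy amplitudes for three reflections (made-up numbers).
f_light = np.array([10.0, 20.0, 30.0])
f_dark = np.array([8.0, 22.0, 30.0])
delta_f = background_subtract(f_light, f_dark, n=0.5)
```

With N = 1 this reduces to a plain |F1| - |F2| difference, which is why the idea sits close to a difference map with one extra parameter.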

kmdalton commented 2 years ago

What would the alternative be? Building this into the difference map script?

alisiafadini commented 2 years ago

The best alternative could be making it a function that can be called from any script. It may work better as its own script.

JBGreisman commented 2 years ago

I'm a bit confused here by the XYZ coordinates -- is that just used to generate a mask for the output map? Where does that factor in if the subtraction is handled in reciprocal space?

This seems very close to rs.diffmap with the addition of an N term for weighting the dark dataset. I could see it being added as a flag to that script, or as its own. Right now my inclination is to start as its own script. It's somewhere conceptually between rs.diffmap and rs.extrapolate (which handles generating extrapolated structure factors for refinement of excited states).

alisiafadini commented 2 years ago

@JBGreisman the background subtraction is done in reciprocal space, but the screening for the right value of N is done in real space (you try to maximize the difference in CC between a local region, e.g. the chromophore, and the rest of the protein). Whichever background subtraction value is found to maximize this is used for the reciprocal space difference. I can share a figure if easier?

Because of this extra part of the analysis it's a little more complex than just adding an N option to diffmap I guess. It's probably closer conceptually to rs.extrapolate
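The real-space screening could look roughly like the sketch below. The thread does not say which maps are correlated, so this assumes the CC is taken between each candidate background-subtracted map and a reference (dark) map, and that the score is CC(rest) minus CC(local region). All names (`screen_n`, `make_map`, `pearson_cc`) and the random toy maps are hypothetical; in a real workflow `make_map(n)` would Fourier transform the reciprocal-space difference for that N.

```python
import numpy as np


def pearson_cc(a, b):
    """Pearson correlation coefficient between two arrays of equal size."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


def screen_n(candidates, make_map, reference_map, mask):
    """Score each candidate N by CC(rest) - CC(local region).

    Assumption: a good N leaves the map outside the region of interest
    correlated with the reference while the region itself decorrelates.
    """
    scores = []
    for n in candidates:
        rho = make_map(n)
        cc_local = pearson_cc(rho[mask], reference_map[mask])
        cc_rest = pearson_cc(rho[~mask], reference_map[~mask])
        scores.append(cc_rest - cc_local)
    scores = np.array(scores)
    best_n = float(candidates[int(np.argmax(scores))])
    return best_n, scores


# Toy real-space maps on a 16^3 grid (random numbers, not real data).
rng = np.random.default_rng(0)
dark = rng.normal(size=(16, 16, 16))
mask = np.zeros(dark.shape, dtype=bool)
mask[5:11, 5:11, 5:11] = True                      # region of interest
signal = rng.normal(size=dark.shape) * mask        # change confined to the region
noise = 0.2 * rng.normal(size=dark.shape)
light = dark + signal + noise

candidates = np.linspace(0.0, 1.0, 21)
# Stand-in for "subtract in reciprocal space, then transform to a map".
best_n, scores = screen_n(candidates, lambda n: light - n * dark, dark, mask)
```

The chosen `best_n` would then be the value used for the reciprocal-space difference; `candidates` and `scores` are also what one would plot for inspection afterwards.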

JBGreisman commented 2 years ago

got it -- that makes sense to me. I agree this should be its own script, and we can add a figure as part of an example/documentation.

alisiafadini commented 2 years ago

I like the rocket emoji – and sounds good. On that note, what's your policy on plotting/graphics (not sure if this should be another issue)? I have some plotting sections in my script that save things like the screening of the regularization parameter for inspection after running the script. I saw that matplotlib and seaborn are not required dependencies for the rs install – so maybe plotting should be optional or removed?

JBGreisman commented 2 years ago

I'm fine with plotting/graphics -- matplotlib and seaborn are requirements already for rsbooster (but you're correct that they are not requirements in rs itself).

https://github.com/Hekstra-Lab/rs-booster/blob/ecd60451db3e459fbe84f3c27469cbcd7fe5f835/setup.py#L38

Some of the methods in the stats submodule open plots. I think you can either open an interactive plot using something like plt.show(), or you can take an output filename for writing a plot. Whatever seems best for your use case.
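Both options can be combined in one helper, sketched here under some assumptions. The function name `plot_screen` and its parameters are hypothetical and not part of rs-booster. The lazy matplotlib import is one possible way to keep plotting optional, not something the package requires.

```python
def plot_screen(values, scores, outfile=None, show=False):
    """Plot screening scores; do nothing unless a plot was requested.

    Hypothetical helper: writes the figure to `outfile` if given and
    opens an interactive window if `show` is True.
    """
    if outfile is None and not show:
        return None  # plotting is opt-in

    # Imported lazily so matplotlib is only needed when plotting.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(values, scores, marker="o")
    ax.set_xlabel("screened value")
    ax.set_ylabel("score")
    if outfile is not None:
        fig.savefig(outfile)
    if show:
        plt.show()
    plt.close(fig)
    return outfile


# Default call: no file name and no interactive window, so nothing is drawn.
result = plot_screen([0.0, 0.5, 1.0], [0.1, 0.4, 0.2])
```

From a command-line script this maps onto something like an optional output-filename argument, with plotting skipped when it is not given.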

alisiafadini commented 2 years ago

Ah great – yeah sorry, I was looking at rs instead of rs-booster. I think I'll keep the plotting sections in my scripts but maybe make them optional.