rs-station / reciprocalspaceship

Tools for exploring reciprocal space
https://rs-station.github.io/reciprocalspaceship/
MIT License
28 stars 12 forks source link

Raise ValueError when stack_anomalous() will result in duplicate column names #128

Closed dennisbrookner closed 2 years ago

dennisbrookner commented 2 years ago

This was pretty straightforward. Now, if you attempt to call DataSet.stack_anomalous() on a dataset containing columns such as I, I(+), and I(-), you are greeted with the error message:

ValueError: Stacking anomalous data will result in duplicate column names. Rename or drop column 'I' and try again.

I also added test_stack_anomalous_duplicates() to test_dataset_anomalous.py to verify this behavior.

Minor reordering

In the previous version of DataSet.stack_anomalous(), the line:

new_labels = [l.rstrip(suffixes[0]) for l in plus_labels]

comes after the following:

centrics = self.label_centrics()["CENTRIC"]
dataset_plus = self.drop(columns=list(minus_labels))
dataset_minus = self.loc[~centrics].drop(columns=list(plus_labels))
dataset_minus.apply_symop(gemmi.Op("-x,-y,-z"), inplace=True)

I was concerned that if I didn't check for bad labels until these four lines had already been run, that when the function exits with an error, the DataSet might still get modified without the user realizing? I'm not entirely sure if that's how things work, but in any event, I reordered things. Now, the labels are checked for potential duplicates before the DataSet is modified.

A random and annoying thing that I changed

Spyder astutely pointed out to me that several tests in test_dataset_anomalous.py contained things like:

with pytest.raises(ValueError):
    result = data_unmerged.unstack_anomalous()

but never actually access the result. I changed these instances to

with pytest.raises(ValueError):
    data_unmerged.unstack_anomalous()

etc. This is totally unrelated to this PR, so I'm happy to take this out if making this change here is annoying for code/version organization purposes.

codecov-commenter commented 2 years ago

Codecov Report

Merging #128 (624dddd) into main (1dd4e98) will increase coverage by 0.00%. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #128   +/-   ##
=======================================
  Coverage   98.72%   98.72%           
=======================================
  Files          41       41           
  Lines        1571     1574    +3     
=======================================
+ Hits         1551     1554    +3     
  Misses         20       20           
Flag Coverage Δ
unittests 98.72% <100.00%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
reciprocalspaceship/dataset.py 98.07% <100.00%> (+0.01%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1dd4e98...624dddd. Read the comment docs.