Closed GeorgWa closed 8 months ago
Looks nice
I discussed the topic a bit with @mo-sameh and we reasoned that there is currently some ambiguity between a SpecLibBase and a SpecLibFlat. As SpecLibFlat inherits from and is produced by a SpecLibBase, it can still have dense representations (fragment_mz_df, fragment_intensity_df, etc.) and inherited functions like remove_unused_fragments which don't operate on the flat representation. This is confusing, as one would expect them to affect the flat library.
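To make the ambiguity concrete, here is a minimal sketch with toy classes. The names ToyDenseLib, ToyFlatLib, dense_rows, flat_rows and used_row_ids are hypothetical stand-ins, not the alphabase classes or their attributes; the sketch only assumes that the inherited method touches the dense table and nothing else.

```python
class ToyDenseLib:
    """Toy stand-in for a dense library (hypothetical, not SpecLibBase)."""

    def __init__(self, dense_rows, used_row_ids):
        self.dense_rows = dense_rows        # one row of fragment values per position
        self.used_row_ids = used_row_ids    # rows referenced by some precursor

    def remove_unused_fragments(self):
        # Only the dense table is compacted here.
        self.dense_rows = [
            row for i, row in enumerate(self.dense_rows) if i in self.used_row_ids
        ]
        self.used_row_ids = set(range(len(self.dense_rows)))


class ToyFlatLib(ToyDenseLib):
    """Toy flat library that inherits the dense behaviour unchanged."""

    def __init__(self, dense_rows, used_row_ids, flat_rows):
        super().__init__(dense_rows, used_row_ids)
        self.flat_rows = flat_rows          # the flat representation


flat_lib = ToyFlatLib(
    dense_rows=[[100.0], [200.0], [300.0]],
    used_row_ids={0, 2},
    flat_rows=[100.0, 200.0, 300.0],
)
flat_lib.remove_unused_fragments()

# The dense table shrank, but the flat table was never touched.
print(len(flat_lib.dense_rows), len(flat_lib.flat_rows))
```

The call succeeds without any error, which is the confusing part: nothing tells the caller that the flat table was left as it was.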
Therefore I made some small changes to this PR:
- Renamed available_fragment_dfs to available_dense_fragment_dfs to make clear this only returns dense fragment matrices.
- [...] SpecLibFlat.available_dense_fragment_dfs to make clear that dense representations are not supported as part of a SpecLibFlat.
- [...] SpecLibBase.parse_base_library is called. If keep_original_frag_dfs is set to true, a deprecation warning is issued.
- [...] SpecLibFlat.remove_unused_fragments is called.

I think to handle complexity we have to keep the different library types better separated.
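As an illustration of the deprecation warning on keep_original_frag_dfs mentioned in the list above, a rough sketch using Python's standard warnings module could look like this. ToyFlatLibrary, its attributes, the warning text, and the convention that 0.0 marks an absent fragment are all assumptions for the example; only the parameter name keep_original_frag_dfs and the method name parse_base_library come from the discussion, and the real signature may differ.

```python
import warnings


class ToyFlatLibrary:
    """Toy stand-in for a flat library; not the real SpecLibFlat."""

    def __init__(self):
        self.flat_fragments = []   # hypothetical flat representation
        self.dense_tables = {}     # hypothetical storage for dense tables

    def parse_base_library(self, dense_tables, keep_original_frag_dfs=False):
        if keep_original_frag_dfs:
            # Keeping dense tables on a flat library is the discouraged path.
            warnings.warn(
                "keep_original_frag_dfs=True is deprecated; dense fragment "
                "tables are not supported on a flat library.",
                DeprecationWarning,
                stacklevel=2,
            )
            self.dense_tables = dict(dense_tables)
        # Flatten: keep only non-zero entries (assumption: 0.0 means "absent").
        self.flat_fragments = [
            mz
            for row in dense_tables.get("fragment_mz", [])
            for mz in row
            if mz > 0
        ]


# Record warnings instead of printing them, so the example stays quiet.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    lib = ToyFlatLibrary()
    lib.parse_base_library(
        {"fragment_mz": [[100.0, 0.0], [200.0, 300.0]]},
        keep_original_frag_dfs=True,
    )

print([w.category.__name__ for w in caught])
```

A warning rather than an exception keeps existing callers working while signalling that the dense tables on a flat library are on their way out.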
Looking forward to hearing your thoughts.
We should use SpecLibBase as the composite instead of the parent class.
The save/load_hdf functions in flat.py are still using the dense DataFrames. I agree that there's a need for refactoring the class structures and their relationships. Apart from that, for this PR everything looks good to me.
I agree, or we should have a base class which only handles precursor operations and have this as parent class for SpecLibBase and SpecLibFlat. I think this would be a separate bigger refactoring involving Magnus.
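The shared-parent idea could be sketched roughly as follows. All class, method, and attribute names here (PrecursorLibrary, DenseLibrary, FlatLibrary, to_flat, fragment_mz_rows) are hypothetical and only illustrate the shape of such a refactoring, not how alphabase is or will be structured.

```python
class PrecursorLibrary:
    """Hypothetical shared parent: precursor-level operations only."""

    def __init__(self, precursors):
        self.precursors = list(precursors)

    def n_precursors(self):
        return len(self.precursors)


class DenseLibrary(PrecursorLibrary):
    """Holds dense fragment rows; 0.0 is assumed to mean 'no fragment'."""

    def __init__(self, precursors, fragment_mz_rows):
        super().__init__(precursors)
        self.fragment_mz_rows = fragment_mz_rows

    def to_flat(self):
        # Produce a flat library instead of being its parent class.
        flat = [mz for row in self.fragment_mz_rows for mz in row if mz > 0]
        return FlatLibrary(self.precursors, flat)


class FlatLibrary(PrecursorLibrary):
    """Holds only the flat fragment list; no dense attributes are inherited."""

    def __init__(self, precursors, fragments):
        super().__init__(precursors)
        self.fragments = fragments


dense = DenseLibrary(["PEPTIDE", "PROTEIN"], [[100.0, 0.0], [200.0, 300.0]])
flat = dense.to_flat()

print(flat.n_precursors(), flat.fragments, hasattr(flat, "fragment_mz_rows"))
```

With this layout a flat library still shares the precursor handling, but it cannot accidentally expose dense tables or dense-only methods.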
Yes, let's keep it as it is now. You can merge, I think.
SpecLibBase.available_fragment_dfs() would return ['_fragment_df', '_fragment_intensity_df', '_fragment_mz_df'] on a SpecLibFlat and fail on .remove_unused_fragments(). This makes sure it's really matching _fragment_[attribute_name]_df.
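One way to express that stricter match is a regular expression that requires a non-empty attribute name between _fragment_ and _df. This is a sketch under that assumption; the helper name is_dense_fragment_df_name and the exact pattern are illustrative and not necessarily what the PR uses.

```python
import re

# Hypothetical pattern: at least one word character must sit between
# "_fragment_" and "_df", so the flat "_fragment_df" cannot match.
DENSE_FRAGMENT_DF_PATTERN = re.compile(r"_fragment_\w+_df")


def is_dense_fragment_df_name(name: str) -> bool:
    """Return True if the name looks like _fragment_[attribute_name]_df."""
    return DENSE_FRAGMENT_DF_PATTERN.fullmatch(name) is not None


names = ["_fragment_df", "_fragment_intensity_df", "_fragment_mz_df", "_precursor_df"]
dense_names = [name for name in names if is_dense_fragment_df_name(name)]

print(dense_names)
```

A plain prefix and suffix check such as name.startswith("_fragment_") and name.endswith("_df") would let _fragment_df through, which is why the pattern insists on the middle part.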