Roestlab / massdash

MassDash: A web-based dashboard for streamlined DIA-MS visualization, analysis, prototyping, and optimization
https://massdash.streamlit.app/
BSD 3-Clause "New" or "Revised" License

[ADD] classes for different molecule levels #19

Closed singjc closed 11 months ago

singjc commented 11 months ago

I added precursor, product, peptide and protein classes under struct to represent the different levels of an MS protein.

jcharkow commented 11 months ago

I think it looks good. What would be the specific usage for these classes? Would they be more for the GUI, or should they be incorporated into file loading?

singjc commented 11 months ago

> I think it looks good. What would be the specific usage for these classes? Would they be more for the GUI, or should they be incorporated into file loading?

They would be incorporated into the GUI, but for back-end logistics. Since we have a hierarchy of protein selection -> peptide selection -> precursor charge selection, the classes would keep all of these levels linked, including the product ions, and PeakFeatures could then be attached to them as well. What do you think?
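To make the linked hierarchy concrete, here is a minimal sketch of how protein -> peptide -> precursor -> product could hang together. The class fields, the `product_annotations` helper and all values are hypothetical illustrations, not the actual classes in this PR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Product:
    annotation: str  # fragment ion label, e.g. "y4" (illustrative)
    mz: float


@dataclass
class Precursor:
    charge: int
    mz: float
    products: List[Product] = field(default_factory=list)


@dataclass
class Peptide:
    sequence: str
    precursors: List[Precursor] = field(default_factory=list)


@dataclass
class Protein:
    name: str
    peptides: List[Peptide] = field(default_factory=list)


def product_annotations(protein: Protein, sequence: str, charge: int) -> List[str]:
    """Follow protein -> peptide -> precursor charge and list the linked product ions."""
    return [
        product.annotation
        for peptide in protein.peptides
        if peptide.sequence == sequence
        for precursor in peptide.precursors
        if precursor.charge == charge
        for product in precursor.products
    ]


# Made-up example values, only to show the linkage.
protein = Protein(
    name="PROT_A",
    peptides=[
        Peptide(
            sequence="PEPTIDEK",
            precursors=[
                Precursor(
                    charge=2,
                    mz=500.25,
                    products=[Product("y4", 450.2), Product("b3", 320.1)],
                )
            ],
        )
    ],
)

selected = product_annotations(protein, "PEPTIDEK", 2)
```

Something like PeakFeatures could be attached the same way, as one more list field on the precursor.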

singjc commented 11 months ago

> I'm not too experienced with the GUI implementation you had in mind, but I am a little worried we are being a bit too "fine-grained" with the classes. E.g. would it be easier to store the results in a pandas DataFrame for access purposes, as this would be more storage efficient? Also, I am a bit confused by the getter/setter methods. Are you planning on implementing error checking there? If so, to make the codebase leaner, would it be beneficial to adopt a package like param? https://param.holoviz.org/

I think I would be okay with just having a dataframe containing the protein, peptide, precursor and product information. I agree this would be a lot easier to handle, and the data would be easier to access. I just wasn't sure how deep into the object-oriented refactoring you wanted to go lol

As for the getters and setters, they make the attributes of the class mutable, so that they can be set or updated once the information is available. But if we agree to just have a dataframe representation of a transition list for a specific precursor, then we could forgo these low-level structures.
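For reference, the kind of error checking asked about above could live in a plain Python property setter. This is only a sketch using the standard library, not the param package and not the code in this PR; the `Precursor` fields and the validation rule are assumptions.

```python
from typing import Optional


class Precursor:
    """Hypothetical precursor whose charge can be filled in later, with validation."""

    def __init__(self, sequence: str, charge: Optional[int] = None) -> None:
        self.sequence = sequence
        self._charge: Optional[int] = None
        if charge is not None:
            self.charge = charge  # route through the setter so it is validated

    @property
    def charge(self) -> Optional[int]:
        return self._charge

    @charge.setter
    def charge(self, value: int) -> None:
        # Assumed rule for illustration: charge must be a positive integer.
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"charge must be a positive integer, got {value!r}")
        self._charge = value


precursor = Precursor("PEPTIDEK")  # charge not known yet
precursor.charge = 2               # set once the information is available
```

An invalid assignment such as `precursor.charge = 0` raises `ValueError` and leaves the previous value in place.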

jcharkow commented 11 months ago

I think for now it is likely easier to forgo these lower-level structures, especially if the use cases can be easily addressed with just the dataframes. For example, I think the chromatogram class is ok because we are loading a single chromatogram at a time, and likewise peakPicking, because we are doing peak picking one transition at a time. But for the data that is already tabular, at least currently, we can forgo the molecular structs.
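A rough sketch of the dataframe-only approach, assuming a flat transition table with one row per product ion. The column names, the `select_transitions` helper and the values are made up for illustration and are not taken from the MassDash codebase.

```python
import pandas as pd

# Hypothetical flat transition list: one row per product ion.
transitions = pd.DataFrame(
    {
        "protein": ["PROT_A"] * 4,
        "peptide": ["PEPTIDEK"] * 4,
        "precursor_charge": [2, 2, 3, 3],
        "precursor_mz": [500.25, 500.25, 333.84, 333.84],
        "product_annotation": ["y4", "b3", "y4", "y5"],
        "product_mz": [450.2, 320.1, 450.2, 563.3],
    }
)


def select_transitions(df: pd.DataFrame, protein: str, peptide: str, charge: int) -> pd.DataFrame:
    """Mimic the protein -> peptide -> precursor charge selection with a boolean mask."""
    mask = (
        (df["protein"] == protein)
        & (df["peptide"] == peptide)
        & (df["precursor_charge"] == charge)
    )
    return df.loc[mask]


selected = select_transitions(transitions, "PROT_A", "PEPTIDEK", 2)
```

The protein, peptide and precursor columns repeat on every row, but the selection hierarchy becomes a simple filter and no separate molecular structs are needed.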