jalew188 closed this pull request 2 years ago
Hi, overall the docstrings improved quite a bit.
I think we can still improve on the notebooks, e.g. with a small header on each notebook that explains what can be found there. For example, I would not know what `calc_precursor_isotope` is being used for. Why is it only giving two intensities? Maybe we could start drawing a graphical representation of the classes/functions of the public methods as a sort of API documentation.
Test consistency: there is often a mix and match in the tests; sometimes the asserts are inside a `def test_` function, sometimes they are just asserts in a hidden function.
Other general comments (maybe out of scope for this PR) to consider?
Perspective: in principle we have most of the fragment calculation in there. When would be a good time to replace the respective codebase in AlphaPept with this?
> Hi, overall the docstrings improved quite a bit.
> I think we can still improve on the notebooks, e.g. with a small header on each notebook that explains what can be found there. For example, I would not know what `calc_precursor_isotope` is being used for. Why is it only giving two intensities? Maybe we could start drawing a graphical representation of the classes/functions of the public methods as a sort of API documentation.
Thank you for the quick review. There may be more docs in the notebooks before `nbdev_build_docs`.
For `calc_precursor_isotope`, Sander asked to add isotope 1 into the DIA library, so I think giving him 2 isotopes will be more convenient for him. It makes sense, as sometimes M1 or even M2 is more abundant than M0 (mono).
I did not think too much about this API; we can of course provide more isotopes.
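As an illustration of why M1 matters at peptide masses, here is a minimal sketch, assuming the averagine model and counting only 13C contributions; the function name and constants are illustrative, not the real `calc_precursor_isotope` API:

```python
# A minimal sketch, NOT the actual alphabase calc_precursor_isotope:
# estimate the relative intensities of the first two isotope peaks (M0, M1)
# from the precursor mass, using the averagine model and a binomial model
# for the number of 13C atoms.
AVERAGINE_MASS = 111.1254        # average mass of one "averagine" residue
CARBONS_PER_AVERAGINE = 4.9384   # average number of carbons per residue
C13_ABUNDANCE = 0.0107           # natural abundance of 13C

def first_two_isotope_intensities(precursor_mass: float):
    """Return (M0, M1) intensities, normalized so that M0 + M1 == 1."""
    n_carbon = precursor_mass / AVERAGINE_MASS * CARBONS_PER_AVERAGINE
    p0 = (1 - C13_ABUNDANCE) ** n_carbon                # zero 13C atoms
    p1 = n_carbon * C13_ABUNDANCE * (1 - C13_ABUNDANCE) ** (n_carbon - 1)  # exactly one 13C
    total = p0 + p1
    return p0 / total, p1 / total

m0, m1 = first_two_isotope_intensities(2000.0)  # a ~2 kDa peptide
print(f"M0={m0:.3f}, M1={m1:.3f}")  # M1 approaches M0 as precursor mass grows
```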
> Test consistency: there is often a mix and match in the tests; sometimes the asserts are inside a `def test_` function, sometimes they are just asserts in a hidden function.
Yes, legacy problems. After writing some test functions, I thought that if we use nbs, we don't need separate test functions...
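For the record, the two styles currently mixed in the repo look roughly like this (a minimal sketch; `add_mass` is a toy stand-in, not an alphabase function):

```python
# A toy stand-in for an alphabase function (hypothetical).
def add_mass(a: float, b: float) -> float:
    return a + b

# Style 1: pytest-style test function, discovered and run by `pytest`.
def test_add_mass():
    assert add_mass(1.0, 2.0) == 3.0

# Style 2: nbdev-style bare asserts in a notebook cell, executed when
# `nbdev_test_nbs` runs the notebook top to bottom.
assert add_mass(1.0, 2.0) == 3.0
```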
> Other general comments (maybe out of scope for this PR) to consider?
> - For some methods I am not sure how they will scale, e.g. what if I have a very large FASTA? Maybe we can have maximum sizes?
Good points. Generating a FASTA library on the fly requires computational power, so maybe it is better for users to use pre-built libraries. It is not easy to control the size: for example, if we add phospho to a library, the library will be nearly 3-10 times larger than the unmodified library. And the size is hard to predict for other PTMs...
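A back-of-the-envelope sketch of the blow-up (standard combinatorics, not alphabase code): with n phosphorylatable residues (S/T/Y) in a peptide and at most k variable modifications, one sequence expands into the sum over i of C(n, i) peptidoforms.

```python
# Count the peptidoforms one peptide expands into when a variable modification
# can sit on any of n_sites residues, with at most max_var_mods sites modified.
from math import comb

def n_peptidoforms(n_sites: int, max_var_mods: int) -> int:
    return sum(comb(n_sites, i) for i in range(min(n_sites, max_var_mods) + 1))

for n in (2, 4, 6):
    print(n, n_peptidoforms(n, max_var_mods=3))
# 2 -> 4, 4 -> 15, 6 -> 42: a few S/T/Y residues already multiply the library size
```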
> - When using high-level objects such as `precursor_df`, we could raise errors in the API functions when they do not have the respective attributes. This would help a lot when calling them.
Yes, as `precursor_df`'s lifetime is as long as the software's execution time, it must be checked before we use it. I am still not sure where the best place for the checks is, but we can always catch the `KeyError` or `AttributeError` raised by pandas when using it.
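A minimal sketch of the early-check idea (the helper name is hypothetical, not an alphabase API): validate the required columns up front and raise an informative error instead of a bare pandas `KeyError` deep inside a function.

```python
import pandas as pd

def check_precursor_df(precursor_df: pd.DataFrame, required_columns) -> None:
    """Fail early with a clear message if required columns are missing."""
    missing = [col for col in required_columns if col not in precursor_df.columns]
    if missing:
        raise KeyError(
            f"precursor_df is missing required columns: {missing}. "
            f"Available columns: {list(precursor_df.columns)}"
        )

df = pd.DataFrame({"sequence": ["PEPTIDE"], "charge": [2]})
check_precursor_df(df, ["sequence", "charge"])   # passes silently
# check_precursor_df(df, ["precursor_mz"])       # raises an informative KeyError
```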
> Perspective: in principle we have most of the fragment calculation in there. When would be a good time to replace the respective codebase in AlphaPept with this?
We could do that very soon :)
After this PR, we may not have big PRs for alphabase unless we have more common features to add, for example fasta/protein processing and Sander's AlphaUtils/Performance code.
So there is no hurry to review this PR; I will not merge it until we want to officially release AlphaBase. @swillems @straussmaximilian