biocore / metagenomics_pooling_notebook

Jupyter notebooks to assist with sample processing
MIT License

Update rescale_counts.py #104

Closed justinshaffer closed 1 year ago

justinshaffer commented 1 year ago

Removed the function that generated read count output; updated the cell count function to report cell counts per gram of input sample material.

justinshaffer commented 1 year ago

Thanks! Yes, I can confirm that the values will be < 1.

On Mon, Mar 27, 2023 at 8:47 PM Charles Cowart @.***> wrote:

@.**** requested changes on this pull request.

@antgonza https://github.com/antgonza Neither the removed function nor the modified one appears to be called by anything in the library or in any of the sample notebooks.

In metapool/rescale_counts.py https://github.com/biocore/metagenomics_pooling_notebook/pull/104#discussion_r1149992295 :

    # Careful with types here.  If you use ints,
    # the length * 650 * 10**9 can overflow integers with very long genomes
-   mult_row = (6.022 * (10.0 ** 23)) / (lengths * (650 * 10.0 ** 9))
+   mult_row = (6.022 * (10.0 ** 23)) / (lengths * (650 * 10.0 ** 9) * sample_weights)

@justinshaffer https://github.com/justinshaffer can you confirm that sample_weights will be less than 1? Based on the comment above the line, multiplying the denominator by another value larger than one would make it even easier for it to overflow.
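For anyone following along, here is a minimal sketch of the overflow concern raised above. It is not the code in metapool/rescale_counts.py: the function name `cell_multiplier`, the constant names, and the example genome length and sample weight are all assumptions made for illustration. It only shows that an int64 denominator can wrap silently for a long genome, while casting to float64 first avoids the problem.

```python
import numpy as np

# Constants as they appear in the diff; the names are hypothetical.
AVOGADRO = 6.022 * (10.0 ** 23)
DENOM_SCALE = 650 * 10.0 ** 9


def cell_multiplier(lengths, sample_weights):
    """Hypothetical helper mirroring the modified mult_row expression.

    Inputs are cast to float64 first so the denominator cannot wrap
    around the way a fixed-width integer product can.
    """
    lengths = np.asarray(lengths, dtype=np.float64)
    sample_weights = np.asarray(sample_weights, dtype=np.float64)
    return AVOGADRO / (lengths * DENOM_SCALE * sample_weights)


# Illustration of the failure mode with integer types: a 20 Mbp genome
# (made-up value) times 650 * 10**9 exceeds the int64 maximum, and NumPy
# array arithmetic wraps silently to a negative number.
int_lengths = np.array([20_000_000], dtype=np.int64)
int_denominator = int_lengths * (650 * 10**9)

# The float version stays positive and finite for the same genome length.
float_multiplier = cell_multiplier(int_lengths, [0.5])
```

With floats, a sample weight above or below 1 only rescales the result, so the overflow risk is really about keeping the inputs out of integer arithmetic.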


-- Justin Shaffer, PhD Postdoctoral Researcher Rob Knight Group Department of Pediatrics, School of Medicine University of California, San Diego justinshafferbio.wordpress.com

antgonza commented 1 year ago

@charles-cowart; yup it should go to the dev branch until we put everything together.