biocore / metagenomics_pooling_notebook

Jupyter notebooks to assist with sample processing
MIT License

fixes most bugs #143

Closed charles-cowart closed 10 months ago

charles-cowart commented 1 year ago

This PR fixes most of the test errors that occurred both before and after integrating Justin's changes. Three tests still fail: test_abundance_read_count(), test_read_count_to_cell_count(), and test_same_asR().

All three fail with the same error message:

raise KeyError(f"None of [{key}] are in the [{axisname}]")
KeyError: "None of [Index(['G000005825', 'G000006605', 'G000006725', 'G000006745', 'G000006785',\n 'G000006845', 'G000006865', 'G000006925', 'G000006965', 'G000007105',\n ...\n 'G900155395', 'G900155405', 'G900155555', 'G900155635', 'G900155965',\n 'G900156305', 'G900156675', 'G900156765', 'G900156885', 'G900163845'],\n dtype='object', name='OTUID', length=3771)] are in the [index]"

The error seems to occur when merging two data frames whose key sets do not intersect at all; it's as if they come from two different datasets. Antonio and I think Justin might be better at resolving this last issue than I am.
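For anyone picking this up, here is a minimal sketch of how disjoint keys can produce this kind of KeyError in pandas. The frame names, columns, and the second set of IDs are made up for illustration, and the `.loc` lookup is only an assumption about where the failure occurs; the actual notebook code may select rows differently.

```python
import pandas as pd

# Hypothetical stand-ins for the two tables; values and plasmid IDs are invented.
abundance = pd.DataFrame(
    {"sample_1": [10, 20], "sample_2": [5, 0]},
    index=pd.Index(["G000005825", "G000006605"], name="OTUID"),
)
genome_lengths = pd.DataFrame(
    {"length": [4000, 5200]},
    index=pd.Index(["plasmid_A", "plasmid_B"], name="OTUID"),
)

# Check the overlap first: an empty intersection means the lookup cannot succeed.
shared = abundance.index.intersection(genome_lengths.index)
print(f"{len(shared)} shared keys")

# Selecting rows by labels that are all absent raises the "None of [...]" KeyError.
try:
    genome_lengths.loc[abundance.index]
    error_message = None
except KeyError as err:
    error_message = str(err)

print(error_message)
```

Running the intersection check on the real inputs before the lookup would confirm whether the key sets are fully disjoint or only partly mismatched.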

@antgonza @justinshaffer If you want to simply add to this branch or merge it to biocore:dev and add your changes I can help you.

antgonza commented 1 year ago

The tests are not running because you issued the PR against dev; could you change it to master?

charles-cowart commented 1 year ago

> The tests are not running because you issued the PR against dev; could you change it to master?

You got it! It should merge into the current main without issue. The functions and files are all new, and the sample metadata files used here are not one of the existing types that have changed recently (e.g., sample-sheet, pre-prep-file).

charles-cowart commented 1 year ago

Update: Merging the dataframes to generate the final output dataframe currently fails. The columns in metadata_samples_plasmid_sequences.txt do not intersect with those of the other two files used in the tests. It appears that the code and/or data needed to bridge this gap was never written. @antgonza is working on it based on feedback from Justin.
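To make the gap concrete, here is a rough sketch of how one might check for shared columns and then join through a mapping table. Everything in it is hypothetical: the column names, the values, and the `bridge` table are placeholders, not the contents of metadata_samples_plasmid_sequences.txt or the fix being worked on.

```python
import pandas as pd


def shared_columns(left: pd.DataFrame, right: pd.DataFrame) -> list:
    """Return the column names present in both frames, in left's order."""
    return [col for col in left.columns if col in right.columns]


# Hypothetical stand-ins for the plasmid metadata and a read-count table.
plasmid_metadata = pd.DataFrame(
    {"plasmid_id": ["p1", "p2"], "length_bp": [3000, 4500]}
)
read_counts = pd.DataFrame(
    {"OTUID": ["G000005825", "G000006605"], "reads": [120, 45]}
)

# No column in common, so there is nothing to merge on directly.
print(shared_columns(plasmid_metadata, read_counts))

# Assumed bridging table mapping one identifier scheme to the other.
bridge = pd.DataFrame(
    {"plasmid_id": ["p1", "p2"], "OTUID": ["G000005825", "G000006605"]}
)

# indicator=True adds a "_merge" column showing which side each row came from.
merged = (
    plasmid_metadata
    .merge(bridge, on="plasmid_id", how="left")
    .merge(read_counts, on="OTUID", how="outer", indicator=True)
)
print(merged)
```

Rows marked `left_only` or `right_only` in the `_merge` column would point to identifiers that still lack a counterpart after bridging.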

charles-cowart commented 11 months ago

Quick update: Sent Amanda B. an invite to the project. Amanda will be working on this issue.