Closed cguccione closed 2 years ago
I'm not sure why I am failing some of these checks still, does anyone have an idea on how to resolve this?
@cguccione the failures are related to flake8, try running flake8 metapool
from the base directory. If that doesn't yield any insights try chatting with @callaband or @ahdilmore they can probably help sort this out.
These are the failures that I'm seeing out of flake8:
metapool/metapool.py:49:1: W293 blank line contains whitespace
metapool/tests/test_metapool.py:137:1: W293 blank line contains whitespace
metapool/tests/test_metapool.py:139:13: F841 local variable 'obs_plate_df' is assigned to but never used
metapool/tests/test_metapool.py:140:1: W293 blank line contains whitespace
Can you also update the amplicon-pooling notebook and any other notebooks where there's a check for repeated sample names?
@ElDeveloper I was able to fix the errors using flake8. I also made changes to amplicon-pooling, metagenomcis_pipeline_MiniPiconorm, metagenomics_pipeline_qPCRnorm, and metageneomics_samplesheet_generator to be identical to the original fix in read_plate_map_csv so they should all handle repeated sample names the same correct way now.
Thanks so much @cguccione! This is very helpful and I think it looks ready to merge, however it looks like there's a lot of changes that came in with the notebooks, do you know what might be going on there?
Hi @ElDeveloper ! I've looked through my pull request a few times and I can't find any changes other than the ones I intended to add along with changes to the output of some of the cells in the jupyter notebooks (that seems to be where most of the changes are coming from). I'm also not sure why it appears that amplicon-pooling.ipynb notebook isn't appearing for you, because I can see it in my commit. I'm happy to set up a meeting with @ahdilmore and @kfogelso and we can figure out how to resolve this issue if that helps! Thanks for all your help.
Hi @cguccione, looks like I made a mistake. You didn't delete amplicon-pooling.ipynb
. Thanks for catching that. I think the only thing that should be updated is the fact that the notebooks currently have error messages in the committed version. For example compare cell 10 in this notebook from your fork with cell 12 in this notebook. As you can tell one of them as an exception's traceback and the other one has the "usual output". I think you should be able to rerun the notebook and as long as there are no errors all should be good to go. Would you mind making sure that the same is true for other notebooks (at least metagenomics_samplesheet_generator.ipynb
shows a similar problem). Please let me know if I can help.
Hi @ElDeveloper! I fixed some of the errors, but I'm having issues running the notebooks without any errors. I tried rebasing my fork in an attempt to resolve any errors, but that did not fix all the issues. These are the main two I am running into which are causing problems downstream in the notebooks and causing there to be major differences when pushing to git.
In the 'metagenomics_pipeline_MiniPiconorm' notebook I get the following errors in the Demux Summary step :
FileNotFoundError: [Errno 2] No such file or directory: './test_data/Demux/Stats.json'
In the 'metagenomics_pipeline_qPCRnorm' notebook I get the following error in the 'Step 2: format sample sheet data' section.
KeyError: "['Project Plate', 'Project Name'] not in index"
I am happy to meet with @ahdilmore or someone who has recently submitted a successful pull request in order to see how they accounted for these errors or what I may have changed to cause them. Thank you for all your help and I apologize this process is taking so long.
@cguccione @ElDeveloper I'm not able to replicate the first error, but I did replicate the second. I'm wondering if others are able to replicate such as @charles-cowart @antgonza
Thank you so much @ahdilmore ! I am also noticing that some of my packages are different versions than yours so I can try updating mine and rerunning the 'metagenomics_pipeline_MiniPiconorm' notebook to see if I can resolve the issue.
Just in general I’ve noticed a few things like unittests will break if I create a brand-new Conda environment using the instructions in the README. Python 3.10 came out recently, and if the new environment uses Python 3.10, some paths have been deprecated or moved, some package versions may change, etc.
On Dec 7, 2021, at 10:56 AM, Caitlin Guccione @.***> wrote:
@cguccione https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_cguccione&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=9ExwW6TpRtzPjKaGwSVQ21zDgq4OvfpSr9thY0olP_E&e= @ElDeveloper https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ElDeveloper&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=jLZ0xO-iXaYcI2A7aP2Zq5MHNG_JP_-SqeCy-qZLKWI&e= I'm not able to replicate the first error, but I did replicate the second. I'm wondering if others are able to replicate such as @charles-cowart https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_charles-2Dcowart&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=TiviN3QYjXU2P0XjrDnR-A3-oKu0Y6qgeoYF9MAlHJI&e= @antgonza https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_antgonza&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=42OkYboPzsZ3a0MxXWi1lpf0TQt59aTPj9NwGtpOP98&e= Thank you so much @ahdilmore https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ahdilmore&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=EyFhWdkDX7-gIt6BhsogwF82VY2-E5zhih_2h_8DZJM&e= ! I am also noticing that some of my packages are different versions than yours so I can try updating mine and rerunning the 'metagenomics_pipeline_MiniPiconorm' notebook to see if I can resolve the issue.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_biocore_metagenomics-5Fpooling-5Fnotebook_pull_39-23issuecomment-2D988186722&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=mp0T_gMJCG0wDoq1lvfAYbGVR30DUuioFTObY9fIS_Y&e=, or unsubscribe https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AKFU7EZYDUYZENM3LF2HOKTUPZKEFANCNFSM5E6LQSDQ&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=2cu_BY7EUrTQCnI4naV7KaL7qQipOytdtCdFmW4_H-A&e=. Triage notifications on the go with GitHub Mobile for iOS https://urldefense.proofpoint.com/v2/url?u=https-3A__apps.apple.com_app_apple-2Dstore_id1477376905-3Fct-3Dnotification-2Demail-26mt-3D8-26pt-3D524675&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=EJpbQKtd1fUMdNHm4OUTIiL8rd4Fi9RxbM0X85DPbNg&e= or Android https://urldefense.proofpoint.com/v2/url?u=https-3A__play.google.com_store_apps_details-3Fid-3Dcom.github.android-26referrer-3Dutm-5Fcampaign-253Dnotification-2Demail-2526utm-5Fmedium-253Demail-2526utm-5Fsource-253Dgithub&d=DwMFaQ&c=-35OiAkTchMrZOngvJPOeA&r=r9g1PxFQBBL7eQ2L8OAF0l29pckeyaar0r5s1gYYuws&m=VfsQlDRPs7Ges7PeLaGBM9jzBp0evljE4IGFUA-Noe6zBve7DxembKQCEroZBTx0&s=vlJ06lAE7ifDhRd2Quxar3Iw4yeFEDvFreBV4sDg4GE&e=.
Files are zipped cause GitHub.
Here's the stats.json file that you are looking for. I suspect not everyone has been running all of the modules in the notebook because they are not modifying the later portion of the notebook.
The second error arises because the input plate_map.tsv file currently in the test_data do not have the most up to date file structure. Old plate maps are missing the columns 'Project Plate' and 'Project Name'. Replacing that file with this one should fix the issue and clean up the repository.
The Stats.json.zip file that @RodolfoSalido provided solved the issue. I agree that I believe it had gotten dropped before because I can't see it on the biocore:master fork of the Github. I added it back in with my 'Incorporate updated files to resolve errors in notebooks' commit above.
The second issue was easily solved by replacing Finrisk 33-36_plate_map.zip with your updated file. I also added the updated file in the 'Incorporate updated files to resolve errors in notebooks' commit above.
I am still getting a
LinAlgError: singular matrix
error in both the 'metagenomics_pipeline_MiniPiconorm' and the 'metagenomics_pipeline_qPCRnorm' notebooks. This isn't a major issue because it still allows me to move through the notebook and outputs a graph, although it does through a long error message. I tried deleting and reinstalling my conda environment to update all my packages in pooling_nb and I still seem to be running into the problem. Does anyone have solutions or should I just push with this minor error?
Thank you so much for your help! @charles-cowart @ahdilmore @RodolfoSalido !
What you describe makes sense. I think the test_data/Demux/Stats.json directory got mistakenly dropped from the repository at one point. It should be there.
I agree with Rodolfo including the file in the repo (as you did) should be the way to go. This way we'll have the ability to fully run the notebooks without any hiccups. Can you send me a note when you add these changes then we can do another pass on the code and merge 👍
On Dec 7, 2021, at 3:13 40PM, RodolfoSalido @.***> wrote:
What you describe makes sense. I think the test_data/Demux/Stats.json directory got mistakenly dropped from the repository at one point. It should be there.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.
@cguccione can you work with @charles-cowart to fix the merge conflcits for this PR?
Hi @ElDeveloper @charles-cowart I looked through the merge conflicts it is just the new output from the jupyter cells that resulted when Charles made some updates, and the minor changes I made to the code in my initial commit which deal with the code failing with repeated sample names. I don't think there are any areas where @charles-cowart and I both changed code differently. @kfogelso and I were planning on meeting with @antgonza later today so I am happy to walk through all the changes in that meeting or if @charles-cowart wants to meet I can do that as well, but I think we should be able to proceed without meeting and I can go through and resolve conflicts.
In regards to the changes that you asked me to make ''Can you send me a note when you add these changes then we can do another pass on the code and merge 👍'', I'm not sure what else you wanted me to change? I added in the Stats.json.zip and the Finrisk 33-36_plate_map.zip file in the previous commit but I'm happy to make more updates if needed, please let me know what you meant. I apologize for not replying to this comment last week prior to the other pull request, I was traveling.
Thank you all for the help!
Fixed #22 Updated read_plate_map_csv notebook to fail with repeated sample names. Removed repeated sample names check step in metagenomics_repooling.ipynb and caused user to not be able to continue with the notebook if there are repeated sample names. Added test case to ensure an error is thrown with repeat sample names in test_metapool.py.