Closed erikwolfsohn closed 2 months ago
Hey @erikwolfsohn,
Thanks for all the changes you've contributed! I've incorporated them into the work I already had underway for the v1.2.1 update, and I've also expanded on some of your additions.
I expanded the try/catch for file permission errors to cover all files generated by the file_handler.py script. I really liked your changes for GISAID, but I made a couple of modifications to eliminate the database prefix issues, and I changed how the logging occurs because I noticed that the incorrect sample name could be used in some cases. The bs-description issue should be resolved, as I also changed how the config file handles the description title and comment. I've split the "bs-description" field into "bs-sample_title" and "bs-sample_description" and updated the documentation in the templates, which should explain those fields better. I also added some more info to the Shiny issue you created.
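For anyone following along, the permission-error handling mentioned above could look roughly like the sketch below. This is only an illustration and not the actual code in file_handler.py: the helper name `write_output_file`, the log message, and the example path are all made up for the example.

```python
import logging
import tempfile
from pathlib import Path


def write_output_file(path, text):
    """Write text to path; return True on success, False if permission is denied.

    Hypothetical helper for illustration only, not taken from file_handler.py.
    """
    try:
        # Creating the parent directory can also fail with PermissionError,
        # so it sits inside the same try block as the write itself.
        Path(path).parent.mkdir(parents=True, exist_ok=True)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    except PermissionError as err:
        logging.error("Permission denied while writing %s: %s", path, err)
        return False
    return True


# Usage: write a small file into a temporary directory (example path is assumed).
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "submission" / "metadata.tsv"
    wrote_ok = write_output_file(target, "sample_name\tcollection_date\n")
    written_text = target.read_text(encoding="utf-8") if wrote_ok else None
```

Returning a boolean instead of re-raising is just one option; the caller could equally log and exit, depending on how the workflow should react to an unwritable output directory.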
Thanks again,
-Dakota
Edit: added some BioSample/SRA workflow changes and attached my config.yaml and metadata files in case you wanted to test against them. example_config.yaml.txt example_metadata.csv
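If it helps when testing older metadata files against the renamed fields, a quick header check along these lines might be useful. It is a rough sketch, not part of the tool: the function name and the `sample_name` column in the example data are assumptions, and only the three bs-* column names come from this thread.

```python
import csv
import io

# Column names as mentioned in this thread.
LEGACY_FIELD = "bs-description"
NEW_FIELDS = ("bs-sample_title", "bs-sample_description")


def check_description_columns(csv_text):
    """Return notes about description columns found in the CSV header.

    Hypothetical helper for illustration; it only inspects the header row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    notes = []
    if LEGACY_FIELD in header:
        notes.append(f"legacy column present: {LEGACY_FIELD}")
    for field in NEW_FIELDS:
        if field not in header:
            notes.append(f"column not found: {field}")
    return notes


# Usage with two made-up metadata snippets (sample_name is an assumed column).
old_style = "sample_name,bs-description\nS1,example\n"
new_style = "sample_name,bs-sample_title,bs-sample_description\nS1,title,desc\n"
old_notes = check_description_columns(old_style)
new_notes = check_description_columns(new_style)
```

The check only reports what it finds; whether a missing column is actually a problem depends on the BioSample package and the config in use.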
Hey Dakota! Thanks for getting this new release out. The new handling for BioSample packages is amazing. I hugely appreciate how much this project simplifies metadata validation across multiple pathogens/packages/repositories and how convenient it makes large volume submissions.
I've tested the BioSample/SRA and GISAID covCLI workflows. The NCBI workflows worked perfectly, but I ran into a bunch of submission failures testing covCLI. I made some modifications, and now my GISAID submissions are going through reliably. I haven't had time to do any serious testing, so I can't say if all these changes will hold up, but I wanted to go ahead and submit a pull request in case there's anything that might be helpful.
The former description of this pull request is a little out of date now. I've been testing GISAID covCLI, SRA, and BioSample heavily, using the SARS-CoV-2 and OneHealth Enteric BioSample packages. Below are changes addressing workflow errors and submission failures I encountered during testing. I want to revisit a few of the changes I made, but hopefully some of them are useful!
⚙️ General
🛠️ covCLI updates/bugfixes
📋 BioSample & SRA updates/bugfixes
organism, bs-host, bs-host_disease.
bs-description isn't in the Submission Wizard template and is not required by NCBI for submission, but the workflow will fail when it isn't present. Edit: I want to revisit this, because it doesn't seem like a missing bs-description causes a crash with every BioSample package, and this is kind of a band-aid anyway.
Link_Sample_Between_NCBI_Databases disabled. SRA submission is likely to fail in this situation unless the user has added BioSample accessions manually beforehand.
Testing data was generated via:
Metadata and config templates were created with the Shiny app Submission Wizard
And the workflow was run with this command: