phac-nml / biohansel

Rapidly subtype microbial genomes using single-nucleotide variant (SNV) subtyping schemes
Apache License 2.0

Add #N/A in subtype field of results when no subtype is found #116

Closed glabbe closed 4 years ago

glabbe commented 4 years ago

This change completes the partial fix from PR #81, which only handled cases where no kmers were found. It now also handles cases where only negative kmers are found and no subtype is found (Issue #112).

glabbe commented 4 years ago

Deleting this line causes a Travis error: https://github.com/phac-nml/biohansel/blob/e1139898ed3dcd5e5c3931e79d3300a073e1b967/bio_hansel/metadata.py#L57

Will try again

peterk87 commented 4 years ago

Hi @glabbe, I think it's failing because the test subtyping-results.tsv file needs to be updated, with empty subtype values replaced with #N/A.

Additionally, a row could be added to the test subtype-metadata.tsv file with #N/A subtype value and some arbitrary information in the other fields.
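One thing to keep in mind when editing those test files: with its default settings, pandas `read_csv` treats the string `#N/A` as a missing value. The sketch below only illustrates that behaviour on a made-up two-row table. The column names are hypothetical, and whether biohansel reads its TSV files this way is an assumption.

```python
import io

import pandas as pd

# Made-up results table; "sample" and "subtype" are illustrative column names.
tsv = "sample\tsubtype\ns1\t1.1\ns2\t#N/A\n"

# Default parsing: "#N/A" is in pandas' default list of NA strings,
# so it is read as NaN.
default = pd.read_csv(io.StringIO(tsv), sep="\t", dtype=str)

# keep_default_na=False keeps "#N/A" as a literal string.
literal = pd.read_csv(io.StringIO(tsv), sep="\t", dtype=str, keep_default_na=False)

print(default["subtype"].tolist())
print(literal["subtype"].tolist())
```

If the literal `#N/A` value is meant to be matched against a `#N/A` row in the metadata table, both tables would need to be read the same way.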

The test might need to be updated since it's very basic:

https://github.com/phac-nml/biohansel/blob/e1139898ed3dcd5e5c3931e79d3300a073e1b967/tests/test_metadata.py#L17-L21

It might be a better idea to test that, where the subtype value is present in both the results and metadata tables, the metadata fields have been merged in as expected. If the results have a subtype value not present in the metadata table, then the fields from the metadata table should be empty (assert np.all(np.isnan(row[metadata_fields]))).
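A rough sketch of what such a test could look like, using a plain pandas left merge as a stand-in for the actual merge in bio_hansel/metadata.py. The tables, column names, and the `check_merge` helper are all hypothetical. It uses `isna()` rather than `np.isnan` because `np.isnan` raises a `TypeError` on object-dtype rows, which is what you get when a row holds strings.

```python
import pandas as pd

# Hypothetical results table; column names are illustrative only.
results = pd.DataFrame({
    "sample": ["s1", "s2", "s3"],
    "subtype": ["1.1", "2.2", "#N/A"],
})

# Hypothetical metadata table: has a "#N/A" row but no row for subtype 2.2.
metadata = pd.DataFrame({
    "subtype": ["1.1", "#N/A"],
    "clade": ["A", "unknown"],
    "source": ["poultry", "none"],
})
metadata_fields = ["clade", "source"]

# Stand-in for the real merge: keep every results row, add metadata columns.
merged = results.merge(metadata, on="subtype", how="left")


def check_merge(merged, metadata, metadata_fields):
    """Check merged metadata fields row by row (hypothetical helper)."""
    by_subtype = metadata.set_index("subtype")
    for _, row in merged.iterrows():
        if row["subtype"] in by_subtype.index:
            # Subtype is in the metadata table: fields must match that row.
            expected = by_subtype.loc[row["subtype"], metadata_fields]
            assert (row[metadata_fields] == expected).all()
        else:
            # Subtype is missing from the metadata table: fields must be empty.
            assert row[metadata_fields].isna().all()


check_merge(merged, metadata, metadata_fields)
```

Here `s2` has a subtype that is absent from the metadata table, so its metadata fields come out empty, while `s1` and the `#N/A` row pick up the values from their matching metadata rows.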

glabbe commented 4 years ago

Thanks for the suggestion, @peterk87! I will test these changes locally and create a new pull request.