CFSAN-Biostatistics / shigatyper

CFSAN Shigella Typing Pipeline
Other
14 stars 6 forks source link

fix S. flexneri serotype prediction; update GHActions #15

Closed kapsakcj closed 1 year ago

kapsakcj commented 1 year ago

Here's my attempt to fix issue #11 regarding S. flexneri serotype prediction.

This PR definitely needs some review here as I'm not a python expert and I'm also not an expert in Shigella serotype prediction and the logic to do so.

I've tweaked the shigatyper/shigatyper.py which seems to have restored the ability to predict serotypes for S. flexneri. I adjusted the if/elif/else conditionals so that if wzx is equal to Sf_wzx (meaning the sample contains the flexneri wzx gene), then remove both Sf_wzx and Sf_wzy from Hits and continue on to S. flexneri serotype prediction.

The else statement now occurs for samples where the wzx hit does not match any known genes and the prediction is set as: prediction = "Unable to determine a serotype based on gene hits" This one may need to be changed prior to merging.

To show that these changes work as expected, I've made a few changes to the GitHub Actions workflow to test a number of different Shigella flexneri Serotypes. I pulled almost all of the SRR accessions from the Shigatyper paper Supplemental table S1, and thus should be good data to test with.

Changes to .github/workflows/test-shigatyper.yml:

Feedback of course welcome, hope this helps resolve the issue!

florathecat commented 1 year ago

Thanks Curtis,

This was how the script was designed to do. I appreciate you spending the time on fixing it.

Yun

From: Curtis Kapsak @.> Sent: Monday, May 15, 2023 2:08 PM To: CFSAN-Biostatistics/shigatyper @.> Cc: Subscribed @.***> Subject: [CFSAN-Biostatistics/shigatyper] fix S. flexneri serotype prediction; update GHActions (PR #15)

Here's my attempt to fix issue #11 https://github.com/CFSAN-Biostatistics/shigatyper/issues/11 regarding S. flexneri serotype prediction.

This PR definitely needs some review here as I'm not a python expert and I'm also not an expert in Shigella serotype prediction and the logic to do so.

I've tweaked the shigatyper/shigatyper.py which seems to have restored the ability to predict serotypes for S. flexneri. I adjusted the if/elif/else conditionals so that if wzx is equal to Sf_wzx (meaning the sample contains the flexneri wzx gene), then remove both Sf_wzx and Sf_wzy from Hits and continue on to S. flexneri serotype prediction.

The else statement now occurs for samples where the wzx hit does not match any known genes and the prediction is set as: prediction = "Unable to determine a serotype based on gene hits" This one may need to be changed prior to merging.

To show that these changes work as expected, I've made a few changes to the GitHub Actions workflow to test a number of different Shigella flexneri Serotypes. I pulled almost all of the SRR accessions from the Shigatyper paper Supplemental table S1, and thus should be good data to test with.

Changes to .github/workflows/test-shigatyper.yml:

Feedback of course welcome, hope this helps resolve the issue!


You can view, comment on, or merge this pull request online at:

https://github.com/CFSAN-Biostatistics/shigatyper/pull/15

Commit Summary

File Changes

(2 https://github.com/CFSAN-Biostatistics/shigatyper/pull/15/files files)

Patch Links:

— Reply to this email directly, view it on GitHub https://github.com/CFSAN-Biostatistics/shigatyper/pull/15 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AB4KBBKKY2HE3E3XREIA6S3XGJWGTANCNFSM6AAAAAAYCRN4F4 . You are receiving this because you are subscribed to this thread. https://github.com/notifications/beacon/AB4KBBI6JQTAWNJBUODK37LXGJWGTA5CNFSM6AAAAAAYCRN4F6WGG33NNVSW45C7OR4XAZNFJFZXG5LFVJRW63LNMVXHIX3JMTHGL5KJRU.gif Message ID: @. @.> >

kapsakcj commented 1 year ago

No problem! Happy to provide a small contribution here 😄

Looks like the GitHub actions workflow needs approval to run from a maintainer of this repo, but otherwise the last workflow iteration can be viewed over on my fork here: https://github.com/kapsakcj/shigatyper/actions/runs/4950141951