ktmeaton / ncov-recombinant

Reproducible workflow for SARS-CoV-2 recombinant sequence detection.
MIT License

Handle numeric strain names, int object has no attribute split #210

Closed ktmeaton closed 1 year ago

ktmeaton commented 1 year ago

If a strain name is purely numeric (digits only), the pipeline crashes at rule sc2rf_recombinants with the error "int object has no attribute split". This happens because sc2rf_recombinants is the first rule where the metadata is parsed by Python, which automatically converts numeric strain names to integers.

When metadata is imported into a pandas dataframe, the "strain" column must be cast to string.
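
A minimal sketch of the cast, assuming the metadata is a tab-separated file read with `pd.read_csv`. The inline sample data and the variable names are hypothetical and are not taken from the pipeline's code:

```python
import io

import pandas as pd

# Hypothetical metadata in which every strain name is purely numeric.
metadata_tsv = "strain\tdate\n12345\t2022-01-01\n00678\t2022-01-02\n"

# Default parsing: pandas infers an integer dtype for the "strain" column,
# so string methods such as .split() are not available on the values.
default = pd.read_csv(io.StringIO(metadata_tsv), sep="\t")

# Casting at import time: dtype={"strain": str} keeps the names as text,
# which also preserves leading zeros.
fixed = pd.read_csv(io.StringIO(metadata_tsv), sep="\t", dtype={"strain": str})

print(default["strain"].dtype)
print(fixed["strain"].tolist())
```

Casting after the fact with `df["strain"].astype(str)` also yields strings, but by then any leading zeros have already been lost, so passing `dtype` at read time is the safer choice.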