nickjcroucher / gubbins

Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins
http://nickjcroucher.github.io/gubbins/
GNU General Public License v2.0

mask_gubbins_aln.py not removing recombinant regions #385

Closed: erinpnewcomer closed this issue 12 months ago

erinpnewcomer commented 1 year ago

Hi!

I'm trying to create a masked core genome alignment so I can see what percentage of the core is being masked by Gubbins, but the script mask_gubbins_aln.py doesn't seem to make any changes to the input core.aln. The recombination_predictions.gff file has 1501 lines of predictions. Has anyone else encountered this, or does anyone have ideas on how to fix it?
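One way to check whether masking changed anything, and to get the percentage of each sequence that was masked, is to compare the input and masked alignments position by position. The following is a minimal sketch, not part of Gubbins: the function names `read_fasta` and `masked_percent` are hypothetical, it assumes both files are FASTA alignments with identical sequence names and lengths, and it assumes "-" is the masking character. Positions that were already "-" in the input are not counted as masked.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA text lines into {name: sequence}."""
    seqs = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the sequence name.
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)
    return {n: "".join(parts) for n, parts in seqs.items()}


def masked_percent(original, masked, missing_char="-"):
    """Percent of positions per sequence that changed to missing_char."""
    result = {}
    for name, seq in original.items():
        new = masked[name]
        if len(new) != len(seq):
            raise ValueError(f"length mismatch for {name}")
        # Count only positions that differ and now hold the masking character.
        changed = sum(1 for a, b in zip(seq, new) if a != b and b == missing_char)
        result[name] = 100.0 * changed / len(seq) if seq else 0.0
    return result


# Toy in-memory example (hypothetical data, not real Gubbins output).
toy_original = ">a\nACGTACGT\n>b\nACGTACGT\n".splitlines()
toy_masked = ">a\nAC--ACGT\n>b\nACGTACGT\n".splitlines()
toy_result = masked_percent(read_fasta(toy_original), read_fasta(toy_masked))
print(toy_result)
```

For real files, the same functions can be fed open file handles, for example `read_fasta(open("core.aln"))`. If every value comes back as 0.0, the masked file really is unchanged relative to the input.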

nickjcroucher commented 1 year ago

Are there any unusual characters in your isolate names (e.g. "#"?)
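A quick way to answer this question for a large alignment is to scan the FASTA headers for characters outside a conservative set. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the default "safe" set (letters, digits, underscore) is an arbitrary choice for illustration, not a list of characters Gubbins is known to accept or reject.

```python
import re


def fasta_names(lines):
    """Return sequence names from FASTA header lines (text after '>')."""
    return [line[1:].split()[0] for line in lines
            if line.startswith(">") and len(line.strip()) > 1]


def unusual_name_chars(names, allowed="A-Za-z0-9_"):
    """Map each name to the sorted list of characters outside `allowed`.

    `allowed` is the body of a regex character class; names with no
    unusual characters are omitted from the result.
    """
    pattern = re.compile(f"[^{allowed}]")
    report = {}
    for name in names:
        found = sorted(set(pattern.findall(name)))
        if found:
            report[name] = found
    return report


# Hypothetical isolate names for illustration only.
toy_lines = [">sample_1", "ACGT", ">iso#2", "ACGT", ">strain.3", "ACGT"]
toy_report = unusual_name_chars(fasta_names(toy_lines))
print(toy_report)
```

Passing a wider class, such as `allowed="A-Za-z0-9_."`, narrows the report to names with less common characters like "#".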

sylarKYG commented 10 months ago

Are there any unusual characters in your isolate names (e.g. "#"?)

I have the same issue as Erin. The masked alignment is the same size as the input alignment. The isolate names contain "." and "_".

nickjcroucher commented 10 months ago

What version are you using, and what is the command you are running?

sylarKYG commented 10 months ago

Python: 3.11.7

Biopython: 1.82

The command: python3.11 /data4/CLC_data4/shiqiucheng/software/mask_gubbins_aln.py --aln clean.full.aln --gff clean.full.recombination_predictions.gff --out clean.full.mask.aln

The recombination_predictions.gff file has 1457 lines of predictions.

nickjcroucher commented 10 months ago

Thanks - what version of Gubbins?

sylarKYG commented 10 months ago

gubbins 2.4.1

nickjcroucher commented 10 months ago

Try upgrading to the latest version (3.3.2)

sylarKYG commented 10 months ago

How are the removed recombinant regions shown in the alignment file: are they replaced by "_" or by "N"? I ask because the byte counts of clean.full.aln and clean.mask.aln are identical, but the counts of A, T, C, and G are different.

nickjcroucher commented 10 months ago

Whatever you set the missing character to. By default it is "-"; this is documented in the script's help.
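This also explains the identical byte counts: masking swaps each base in a predicted region for the missing character one-for-one, so the sequence length (and file size) stays the same while the A/C/G/T counts drop. The sketch below illustrates the idea only and is not the code of mask_gubbins_aln.py. The function names are hypothetical, and regions are given as 1-based inclusive (start, end) pairs, which is the GFF coordinate convention.

```python
def mask_regions(seq, regions, missing_char="-"):
    """Replace each 1-based inclusive (start, end) region with missing_char."""
    chars = list(seq)
    for start, end in regions:
        # Convert to 0-based indices and clamp to the sequence length.
        for i in range(max(start - 1, 0), min(end, len(chars))):
            chars[i] = missing_char
    return "".join(chars)


def count_bases(seq):
    """Count A, C, G and T (case-insensitive) in a sequence."""
    upper = seq.upper()
    return {base: upper.count(base) for base in "ACGT"}


# Toy sequence and region (hypothetical data).
toy_seq = "ACGTACGTAC"
toy_masked = mask_regions(toy_seq, [(3, 6)])

print(len(toy_seq), len(toy_masked))   # lengths are equal
print(count_bases(toy_seq))
print(count_bases(toy_masked))
```

So a masked alignment with the same size as the input is expected. Comparing base counts, or counting the missing character, shows whether masking took place.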