nloyfer / wgbs_tools

tools for working with Bisulfite Sequencing data while preserving reads intrinsic dependencies

bam2pat Invalid input #24

Closed rnbatra closed 1 year ago

rnbatra commented 1 year ago

We used bwamem to create a bam file, and then sambamba for duplicate removal. Unfortunately, bam2pat does not work on the result. Do you have any suggestions for how to move forward? Thanks!

(base) bash-4.2$ wgbstools bam2pat normal_PA12005_merged.mdup.bam --out_dir $OUTDIR
[wt bam2pat] bam: normal_PA12005_merged.mdup.bam
Invalid input argument
Failed

nloyfer commented 1 year ago

Hi, I quickly went through bam2pat.py, and it seems I forgot to print the informative error message in one place: the case where the chromosomes in the bam file do not match the chromosomes in the reference fasta. So:

  1. I just fixed this. If you re-clone wgbstools, it should print a more informative error message for this case.
  2. You can check for yourself how your chromosomes are named in your bam file and in your reference file (fasta).
  3. If this isn't the problem, here is another way to narrow it down: run the wgbstools bam2pat command with the --verbose flag. It will print the subprocess commands it tries to run, which are probably the ones failing.
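
The check in point 2 could be sketched roughly as follows. This is not part of wgbstools; the helper names (sam_header_contigs, fasta_contigs, compare_contigs) are hypothetical, and the "1" vs "chr1" mismatch in the demo is only an assumed example of a naming difference, not necessarily the one in this issue. The header text is assumed to come from something like `samtools view -H your.bam`, and the FASTA lines from reading the reference file.

```python
def sam_header_contigs(header_text):
    """Return contig names from the SN: field of @SQ lines in a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def fasta_contigs(fasta_lines):
    """Return contig names from FASTA header lines (text after '>' up to the first whitespace)."""
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.append(parts[0])
    return names


def compare_contigs(bam_names, fasta_names):
    """Report names that appear on only one side."""
    bam_set, fasta_set = set(bam_names), set(fasta_names)
    return {
        "only_in_bam": sorted(bam_set - fasta_set),
        "only_in_fasta": sorted(fasta_set - bam_set),
    }


# Demo with made-up inputs (assumption: bam uses "1", "2"; fasta uses "chr1", "chr2").
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:900\n"
fasta = [">chr1 example description\n", "ACGT\n", ">chr2\n", "ACGT\n"]
print(compare_contigs(sam_header_contigs(header), fasta_contigs(fasta)))
```

If both lists in the result are empty, the names match and the problem is probably elsewhere; any name listed under only_in_bam is a chromosome that the reference, as given, does not contain.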

As for the solution, you should run wgbstools init_genome REF_NAME --fasta_path FASTA_PATH to initialize wgbstools with the reference that matches your bam file.

If you are working with more than one reference fasta file, you can always switch between them with wgbstools set_default_ref --name NAME.

rnbatra commented 1 year ago

Hi, Thank you for the update. Yes, I did realise the chromosome naming issue. It works now!