Open KovachLab opened 3 years ago
Chromosomes must be numbered or named "X", "Y" or "MT"; an optional "chr" prefix is ignored. This is required because I impose a natural order on chromosomes, which is 1-22, X, Y, MT, so
What are your chromosome names?
The chromosome names in the FASTA reference are as follows (there are about 8000 scaffolds, so I only included a few; their names all follow a similar format):
LG01 LG02 LG03 LG04 LG05 LG06 LG07 LG08 LG09 LG10 LG11 LG12 LG13 LG14 LG15 LG16 LG17 LG18 LG19 LG20 LG21 LG22 LG23 MT_genome GmG20150304_scaffold_1408 GmG20150304_scaffold_1409 GmG20150304_scaffold_1410 . . .
OK, this is a problem! I understand this would be great to have, but currently I don't allow this. I strictly operate on a simple chromosomal scale as for humans. The reason is that pileupCaller needs to match up genomic positions (consisting of chromosome name and position in the chromosome) between the incoming pileup data and the provided SNP file. In order for this matching to work faithfully, I work with a strictly ordered list of chromosomes.
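To illustrate why names like LG01 fail, here is a minimal Python sketch of the idea described above. It is not pileupCaller's actual code, and the names `chrom_key` and `match_positions` are hypothetical. It assumes both inputs are already sorted in the order numbers, X, Y, MT, and that positions are (chromosome, position) pairs.

```python
def chrom_key(name):
    """Sort key for human-style chromosome names: numbers first, then X, Y, MT.

    Hypothetical helper for illustration; an optional "chr" prefix is ignored.
    """
    core = name[3:] if name.startswith("chr") else name
    special = {"X": 0, "Y": 1, "MT": 2}
    if core in special:
        return (1, special[core])
    if core.isascii() and core.isdigit():
        return (0, int(core))
    # Names such as "LG01" or scaffold IDs end up here.
    raise ValueError(f"cannot parse chromosome: {name!r}")


def match_positions(pileup, snps):
    """Yield SNP positions that also occur in the pileup, in one merge pass.

    Both inputs must be sorted by (chromosome order, position); the merge
    only works because every chromosome name maps to a comparable key.
    """
    def key(pos):
        return (chrom_key(pos[0]), pos[1])

    pileup_it, snp_it = iter(pileup), iter(snps)
    p, s = next(pileup_it, None), next(snp_it, None)
    while p is not None and s is not None:
        if key(p) == key(s):
            yield s
            p, s = next(pileup_it, None), next(snp_it, None)
        elif key(p) < key(s):
            p = next(pileup_it, None)  # pileup is behind, advance it
        else:
            s = next(snp_it, None)     # SNP file is behind, advance it
```

With this scheme, a name that does not fit the expected pattern cannot be placed in the order at all, which is roughly the situation behind the "cannot parse chromosome" error.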
I need to think about how best to fix this. Perhaps I could allow the user to input a custom chromosome/scaffold order, or simply assume that the chromosomal order is the same in the pileup data and the SNP file.
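As a rough sketch of the first option, a user-supplied order could be turned into a sort key as shown below. This is only an illustration in Python under assumed names (`make_chrom_key` is hypothetical and not an existing pileupCaller option); the order list would presumably come from the reference, for example the sequence names in the order they appear in the FASTA file.

```python
def make_chrom_key(order):
    """Build a sort key from a user-supplied chromosome/scaffold order.

    Hypothetical helper: `order` is a list of names, earliest first.
    """
    rank = {name: i for i, name in enumerate(order)}
    if len(rank) != len(order):
        raise ValueError("duplicate names in custom chromosome order")

    def key(name):
        if name not in rank:
            raise ValueError(f"chromosome {name!r} not in custom order")
        return rank[name]

    return key


# Example using a few of the names from the reference above.
custom_order = ["LG01", "LG02", "LG03", "MT_genome", "GmG20150304_scaffold_1408"]
chrom_rank = make_chrom_key(custom_order)

positions = [("MT_genome", 12), ("LG02", 500), ("LG01", 42), ("LG02", 7)]
positions.sort(key=lambda pos: (chrom_rank(pos[0]), pos[1]))
```

Any name missing from the list is rejected explicitly, so a mismatch between the pileup data and the SNP file would surface as an error rather than as silently skipped sites.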
Sorry, this will take a bit of time, and I'm sorry that pileupCaller currently doesn't support scaffolded genomes. This is clearly a shortcoming, and I'll work on it.
Hi,
I'm attempting to run sequenceTools to generate pseudo-haploid calls for a list of BAM files.
However, I'm getting the following error: pileupCaller: SeqFormatException "cannot parse chromosome"
Any suggestions? Thanks!