gmarocena / gasv

Automatically exported from code.google.com/p/gasv

maximum # of chromosomes allowed? #3

GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Hi, 
I have renamed my original chromosomes and supplied the new names with the 
-CHROMOSOME_NAMING option.

When I run the cluster part of the program, I get the following message: 

WARNING: Finished looping through all possible Chr combinations, but didn't 
make it through all input file(s)! The inputs file(s) are either not sorted 
properly or have additional data with chromosome number > 24!

I have ~800 unknown groups that haven't been put into the contigs for the 
chromosomes. Is it possible to incorporate them, or can GASV handle only 24 
chromosomes?

Thanks in advance!
S

Original issue reported on code.google.com by suzanne....@gmail.com on 24 Jan 2011 at 7:37

GoogleCodeExporter commented 8 years ago
There are two steps to using the GASV pipeline with a BAM file: 
(1) The BAM Preprocessor, which creates GASV input files
(2) GASV, which clusters and reports variants of each type. 

At this point, both steps default to human chromosomes (i.e., chromosomes 
numbered 1 - 24, or 1 - 22 plus X and Y). If you use any other chromosome 
naming (e.g., a non-human genome or more than 24 chromosomes), you need to 
specify this for both steps. 

(To be complete, I've included detailed instructions for both steps, but you 
likely can just skip to the GASV step.) 

(1) BAM Preprocessor:

When you run the BAM Preprocessor (BAM_preprocessor.pl), you need to specify 
the chromosome naming convention with the -CHROMOSOME_NAMING option (as you 
did above). This will map each chromosome name to a number, as in the following:

Column 1 - chromosome name in the BAM file
Column 2 - replacement chromosome number

Column 1        Column 2 
-------------------------
Ca21chr1        1
Ca21chr2        2
Ca21-mtDNA      9

Note: Please check the output of the BAM Preprocessor to make sure the 
chromosome names were successfully changed.
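
For a genome with hundreds of unplaced groups, writing the two-column mapping by hand is tedious. The following is a minimal sketch of generating those lines, not part of GASV itself. The function name build_chromosome_naming is hypothetical, and the tab separator is an assumption; check what BAM_preprocessor.pl expects before using the result.

```python
def build_chromosome_naming(names, first_number=1, separator="\t"):
    """Return one 'name<separator>number' line per distinct name.

    Numbers are assigned consecutively, starting at first_number, in the
    order the names are given. The separator is an assumption and may need
    to be changed to match what the preprocessor expects.
    """
    lines = []
    seen = set()
    number = first_number
    for name in names:
        if name in seen:
            continue  # a repeated name keeps the number it was first given
        seen.add(name)
        lines.append(f"{name}{separator}{number}")
        number += 1
    return lines


if __name__ == "__main__":
    # Hypothetical names, in the style of the example table above.
    for line in build_chromosome_naming(["Ca21chr1", "Ca21chr2", "Ca21-mtDNA"]):
        print(line)
```

Numbering every sequence consecutively keeps the largest number equal to the count of sequences, which is convenient when that value has to be passed on to the GASV step.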

(2) GASV

If the maximum chromosome number in your input file is larger than 24, you 
will also need to specify the number of chromosomes in the input file with 
the --numChrom option.

For example, if the file test.esp had chromosomes numbered from 1 - 100 you 
would run the following:

java -jar gasv.jar --cluster --numChrom 100 test.esp
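
If you are not sure how high the chromosome numbers in your file go, a small script can find the maximum for you. This is only a sketch under a stated assumption: that each line of the .esp file is whitespace-separated and carries the two chromosome numbers in the 2nd and 6th fields. The function name max_chromosome_number and the sample lines are hypothetical; adjust chrom_columns if your file is laid out differently.

```python
def max_chromosome_number(esp_lines, chrom_columns=(1, 5)):
    """Return the largest chromosome number found in the given lines.

    chrom_columns holds the 0-based indices of the fields that contain
    chromosome numbers. The default (2nd and 6th fields) is an assumption
    about the ESP layout, not something confirmed in this thread.
    """
    highest = 0
    for line in esp_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for col in chrom_columns:
            highest = max(highest, int(fields[col]))
    return highest


if __name__ == "__main__":
    # Hypothetical sample lines in the assumed layout.
    sample = [
        "pair1 3 100 200 + 57 300 400 -",
        "pair2 100 500 600 + 100 900 1000 -",
    ]
    n = max_chromosome_number(sample)
    print(f"java -jar gasv.jar --cluster --numChrom {n} test.esp")
```

For a real file, pass the open file handle instead of the sample list, e.g. max_chromosome_number(open("test.esp")).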

Please let us know if this does not resolve your question.

Original comment by sora...@gmail.com on 24 Jan 2011 at 8:47

GoogleCodeExporter commented 8 years ago
Great! Thanks! Worked great!

Original comment by suzanne....@gmail.com on 25 Jan 2011 at 2:41