Closed by chaizuolin 6 years ago
Chromosomes 13, 14, 15, 21, and 22 are acrocentric. That means there is no DNA content in the short arm. Centromeric regions are also gray, without DNA content.
What I mean is: I created the reference with 1000 samples. Why are all chromosomes shown as unmappable regions? When I created the reference with 700 samples, the results did come out. For creating the reference, is it true that the more samples, the better? How many samples do you recommend for creating a reference?
Hi chaizuolin,
I vaguely remember seeing such a plot before. If I remember correctly, the reference in that case was built using the raw .pickle files instead of the GC-corrected ones (.gcc). Normally the flow of building a WISECONDOR reference is like this (also described in the wiki):
samtools rmdup -s sample.bam - | samtools view - -q 1 | python consam.py -outfile sample.pickle
python ./gcc.py ./in/sample.pickle ./ref/gccount ./in/sample.gcc
python newref.py ./in/refs/ ./ref/reference
Could you double-check that you indeed performed step 2 (the gcc.py GC correction)?
Good, thank you very much. How many samples do you recommend for creating the reference, and how much data should each sample contain?
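One way to check this before running newref.py is to list what is actually in the reference directory. The following is only a minimal sketch, assuming the directory passed to newref.py (./in/refs/ above) is meant to hold the .gcc output of gcc.py and no raw .pickle files. The function name summarize_reference_dir is hypothetical and not part of WISECONDOR.

```python
import os


def summarize_reference_dir(ref_dir):
    """Group the file names in ref_dir by extension.

    Hypothetical helper, not part of WISECONDOR. It only inspects
    file names; it does not open or validate the files themselves.
    """
    gcc_files, pickle_files, other_files = [], [], []
    for name in sorted(os.listdir(ref_dir)):
        if not os.path.isfile(os.path.join(ref_dir, name)):
            continue  # skip subdirectories
        if name.endswith(".gcc"):
            gcc_files.append(name)      # GC-corrected output of gcc.py
        elif name.endswith(".pickle"):
            pickle_files.append(name)   # raw output of consam.py
        else:
            other_files.append(name)
    return {"gcc": gcc_files, "pickle": pickle_files, "other": other_files}
```

Calling summarize_reference_dir("./in/refs/") and finding any names under "pickle" would suggest that some samples skipped the GC-correction step.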
Hi chaizuolin,
Did my previous comment solve your issue with the reference set of 1000 samples?
Regarding your follow-up question, please see the wiki, sections "What do I put in for reference data?" and "How many reads do I need for analysis?".
It seems to me that you are using the legacy version of WISECONDOR. Did you also try the newest version (master)? The new version should allow you to look more closely at the signal of the samples (we've seen a higher resolution when using the new version). Because of your large number of samples, you should definitely see an improvement. Running WISECONDOR is slightly different when using the new version, so please let me know if you need additional help.
Hello. I found the section "How many reads do I need for analysis?" in the wiki, but I can't find the RETRO filter parameter mentioned in that section.
Hi chaizuolin,
You can find and set the parameters for the RETRO filter in the consam.py file: -retdist and -retthres. The default value of both parameters is 4. In the Supplementary data of the WISECONDOR paper you can read up on how the RETRO filter works (and find details about the parameters).
Did the information in this issue help solve your problem with the gray regions?
Hello, I created the reference with 1000 samples. Why is it all gray regions?