format for regions file?

RichardCorbett commented 9 years ago

Hi, I can get useful results using this command: repeatseq A47294.bam GRCh37-lite.fa hg19.2014.noChr.regions

But when I try to make my own regions file using a subset of the lines in hg19.2014.noChr.regions, the .vcf only reports one of the regions I specified. I'm trying to match the same sort order, but I'm not having much luck getting results beyond a region or two from my list. Any ideas?

oiiio commented 9 years ago

What did you use to take your subset? I don't think the order matters, but something may have happened to your delimiters. Was it something simple like "head -100 original.regions > subset.regions"?
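One way to check the delimiter question is to scan the subset file line by line. The sketch below is not part of repeatseq. It assumes a regions line has two columns, with the first shaped like the region quoted later in this thread (chr1:22143098-22143237); the function name and the regular expression are made up for illustration. It counts which delimiter each line uses and flags lines that do not split into two columns.

```python
import re
from collections import Counter

# Assumed shape of column 1, based on the region quoted in this thread.
# This is a guess at the format, not repeatseq's own parsing rule.
REGION_RE = re.compile(r"^\S+:\d+-\d+$")


def check_regions_lines(lines):
    """Return (delimiter_counts, problems) for an iterable of regions lines."""
    delimiters = Counter()
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        if raw.endswith("\r\n"):
            problems.append((number, "Windows line ending"))
        if "\t" in line:
            delimiters["tab"] += 1
            columns = line.split("\t")
        else:
            delimiters["space"] += 1
            columns = line.split()
        if len(columns) != 2:
            problems.append((number, f"expected 2 columns, found {len(columns)}"))
        elif not REGION_RE.match(columns[0]):
            problems.append((number, f"unexpected region: {columns[0]!r}"))
    return delimiters, problems


# Usage (newline="" keeps \r\n visible so it can be reported):
# with open("subset.regions", newline="") as handle:
#     print(check_regions_lines(handle))
```

If the original file and the subset report different delimiter counts, the subsetting step probably rewrote the whitespace.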

meganamsu commented 7 years ago

Does repeatseq use the second column information from the regions file? I have a small number of targets I would like to genotype not derived from TRF.

I made a file with regions defined in the first column, and then "random" in the second column. I received this error: improper second column found for chr1:22143098-22143237.
I then replaced "random" with "2.3_3_100_0_14_0_57_42_0_0.99_GCC" and reran. No errors this time, but the vcf file was empty. Do I need to generate a TRF file and intersect my regions with it?
Notably, my regions are my sequencing target regions and not the precise locations of the STRs. Perhaps that is the problem.

Also, I am assuming the 'fasta' input is the reference. Does it need to be indexed, etc.?

Best, Megan
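For the second-column question above, a small sketch can at least show how a candidate value differs from the one string in this thread that was accepted without an error. It does not reproduce repeatseq's own validation, which is not described here. It assumes that the underscore-separated layout and the trailing repeat unit of that example are what matter, and the function name is hypothetical.

```python
# The only second-column value in this thread that ran without an error.
EXAMPLE = "2.3_3_100_0_14_0_57_42_0_0.99_GCC"


def describe_second_column(value, example=EXAMPLE):
    """Compare a second-column value with the shape of the example string.

    Assumption: fields are joined by underscores and the last field is the
    repeat unit. This mirrors the example only, not repeatseq's real check.
    """
    fields = value.split("_")
    expected = len(example.split("_"))
    motif = fields[-1]
    return {
        "field_count": len(fields),
        "matches_example_count": len(fields) == expected,
        "motif": motif,
        "motif_is_nucleotides": bool(motif) and set(motif.upper()) <= set("ACGTN"),
    }


print(describe_second_column("random"))
print(describe_second_column(EXAMPLE))
```

A value such as "random" has a single field and no nucleotide motif, which fits with it being rejected. Whether the numeric fields also have to describe the real repeat at that locus is a separate question that this sketch cannot answer.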

adaptivegenome / repeatseq

format for regions file? #2