Closed (hchetia closed this issue 2 years ago)
Hi @hchetia, I've traced back this issue to potentially being an issue at Figshare's end. I've contacted them, and will let you know when I have a fix.
Do you recommend going ahead with my actual runs rather than waiting for the test dataset to become available?
The error was caused by missing reference genome files, so those files will also be needed for a regular run.
You can download individual missing files from here in the meantime:
hg19 https://figshare.com/articles/dataset/STRetch_reference_data_-_hg19/4658701/1
hg38 https://figshare.com/articles/dataset/STRetch_reference_data_-_hg38/5844396
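To see which reference files are actually absent before downloading them one by one, a quick check along these lines might help. This is only a sketch: the `reference-data/` directory and the function name `missing_reference_files` are assumptions for illustration, and the suffix list is the standard set of files written by `bwa index` (the `.sa` file is the one named in the error report), not a list taken from STRetch itself.

```python
from pathlib import Path

# Suffixes of the index files written by `bwa index`.
# ".sa" is the one named in the error report in this thread.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_reference_files(fasta_path):
    """Return the FASTA and BWA index files that do not exist on disk."""
    fasta = Path(fasta_path)
    expected = [fasta] + [Path(str(fasta) + suffix) for suffix in BWA_INDEX_SUFFIXES]
    return [path for path in expected if not path.exists()]


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever your reference data lives.
    for path in missing_reference_files("reference-data/hg19.STRdecoys.sorted.fasta"):
        print("missing:", path)
```

Any file reported as missing can then be fetched from the matching Figshare page above.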
Figshare wasn't getting back to me about this error, so I've moved the data. Would you mind updating with git pull and running ./install.sh again? Then let me know if the test data works.
Hi @hdashnow I already ran my data using the files you shared above. Worked fine for me. Thanks.
Great!
Hi @hdashnow Thanks for STRetch. Love the concept of decoy chromosomes. Do you happen to have a reference hg38 FASTA file with the repeats introduced within the genes, along with the correspondingly updated annotations? Adding 2000 trinucleotide repeats would add 6000 residues to the downstream annotation coordinates, right? Any GTF or GFF3 format would work. This genome would be really helpful for visualizing the reads in IGV.
Regards, Hasna
I don't have anything like that. I do have some code for generating FASTA files with different STR alleles, which I used for simulating reads. You could potentially use similar logic to create an alternate reference: https://github.com/quinlan-lab/STRling/blob/master/sim/random_str_alleles.py
I think that visualizing in IGV will still be challenging because of the anchored reads. When you look at the reads aligned to the STRetch genome + decoy, the mates of these anchored reads align to a different chromosome (the decoy), so the anchored reads should show up in a different colour. This can help with visualization.
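For the coordinate arithmetic asked about above, here is a minimal sketch of inserting a repeat expansion into a sequence and shifting annotation coordinates to match. It is not the logic of the linked STRling script or of STRetch; the function names `insert_repeat` and `shift_feature` are hypothetical, and it assumes a single insertion on one contig, with features given as 1-based inclusive coordinates as in GTF/GFF3.

```python
def insert_repeat(sequence, insert_pos, motif, copies):
    """Insert `copies` copies of `motif` immediately after 1-based base `insert_pos`."""
    expansion = motif * copies
    return sequence[:insert_pos] + expansion + sequence[insert_pos:]


def shift_feature(start, end, insert_pos, insert_len):
    """Adjust 1-based inclusive (start, end) for an insertion after base `insert_pos`."""
    if start > insert_pos:
        # Feature lies entirely downstream: both ends move.
        return start + insert_len, end + insert_len
    if end > insert_pos:
        # Feature spans the insertion point: only the end moves.
        return start, end + insert_len
    # Feature lies entirely upstream: unchanged.
    return start, end


if __name__ == "__main__":
    motif, copies = "CAG", 2000
    insert_len = len(motif) * copies  # 2000 copies of a 3 bp motif = 6000 bp
    print(shift_feature(10_500, 11_000, 10_000, insert_len))
```

With several insertions on the same contig, the offsets accumulate, so each feature would need to be shifted by the total length of all insertions upstream of it.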
Awesome. Thanks, will try and get back to you.
Hi, I am trying to test run STRetch using the dataset at https://ndownloader.figshare.com/articles/4762489?private_link=cc7347f4637d9a7fe22d and am running into the following error (please find attached). Basically, the tool looks for a file, "hg19.STRdecoys.sorted.fasta.sa", which is not part of the test dataset.
stretch_error.txt