Closed J35P312 closed 2 years ago
The RepeatMasker files would be great to have for both GRCh37 and GRCh38.
Do you have any server where I can upload them? Or is it OK to put them on KI Box for now?
If it is not too big you can put it here: https://github.com/Clinical-Genomics/reference-files/tree/master/rare-disease/region via a pull request
I'm interested in looking into this at some point.
cat retroseq_refs.tab
Alu /home/daniel.nilsson/sandbox/reference-files/rare-disease/region/grch37_Alu.bed
L1 /home/daniel.nilsson/sandbox/reference-files/rare-disease/region/grch37_L1.bed
SVA /home/daniel.nilsson/sandbox/reference-files/rare-disease/region/grch37_SVA.bed
HERV /home/daniel.nilsson/sandbox/reference-files/rare-disease/region/grch37_HERV.bed
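For anyone who wants to build a file like this for their own checkout, here is a minimal sketch in Python. It assumes the BED files follow the `<build>_<type>.bed` naming seen above; the function name `write_refs_tab` and its parameters are made up for illustration and are not part of RetroSeq.

```python
from pathlib import Path


def write_refs_tab(bed_dir, out_path, prefix="grch37",
                   te_types=("Alu", "L1", "SVA", "HERV")):
    """Write a tab-separated file with one 'TYPE<TAB>path/to/bed' line per repeat type.

    Assumption: each BED file is named '<prefix>_<TYPE>.bed' inside bed_dir.
    """
    bed_dir = Path(bed_dir).resolve()
    lines = []
    for te in te_types:
        bed = bed_dir / f"{prefix}_{te}.bed"
        if not bed.is_file():
            # Fail early instead of writing a list that points at missing files.
            raise FileNotFoundError(f"missing BED file for {te}: {bed}")
        lines.append(f"{te}\t{bed}")
    Path(out_path).write_text("\n".join(lines) + "\n")
    return lines
```

Calling `write_refs_tab("reference-files/rare-disease/region", "retroseq_refs.tab")` would write one line per repeat type, using absolute paths as in the listing above.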
Status:
Should be a project and should not be done in MIP, but in https://github.com/nf-core/raredisease
Hello there!
I want to add mobile element detection using RetroSeq (https://github.com/tk2/RetroSeq). RetroSeq produces a VCF file and is run in two steps, discovery and calling:
1: discovery
perl retroseq.pl -discover -bam input.bam -refTEs repeatElement.tab -output output.bed
input.bam - an indexed BAM file
repeatElement.tab - a tab-separated list giving the name of each repeat type and the path to a BED file with the genomic positions of those repeats. The positions are found using RepeatMasker; I can give you such files on request!
output.bed - the output BED file; this is the input to the calling step
2: calling
perl retroseq.pl -call -bam input.bam -input output.bed -ref ref.fasta -output output.vcf -soft
input.bam - the indexed BAM file
output.bed - the BED file produced by the RetroSeq discovery step
ref.fasta - the reference FASTA file; it should match the BAM file and the RepeatMasker files
output.vcf - the final output VCF
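The two steps could be chained roughly like this. It is only a sketch: the flags are exactly the ones in the commands above, while the function names and default file names are placeholders chosen here.

```python
import subprocess


def retroseq_commands(bam, ref_tes, ref_fasta,
                      candidates="output.bed", vcf="output.vcf",
                      script="retroseq.pl"):
    """Return the discovery and calling commands as argument lists."""
    discover = ["perl", script, "-discover", "-bam", bam,
                "-refTEs", ref_tes, "-output", candidates]
    # The calling step reads the BED file written by the discovery step.
    call = ["perl", script, "-call", "-bam", bam, "-input", candidates,
            "-ref", ref_fasta, "-output", vcf, "-soft"]
    return [discover, call]


def run_retroseq(bam, ref_tes, ref_fasta, **kwargs):
    """Run discovery, then calling; stop at the first step that fails."""
    for cmd in retroseq_commands(bam, ref_tes, ref_fasta, **kwargs):
        subprocess.run(cmd, check=True)
```

Building the argument lists separately from running them makes it easy to print and inspect the commands before anything is executed.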
The output VCF should be frequency annotated; I use the following SVDB command:
svdb --query --db SweGen_RetroSeq.vcf --query_vcf retroseq.vcf --bnd_distance 200 --overlap -1
After frequency annotation, I perform gene annotation using VEP (same command as for SV).
I would remove or rank the variants based on the FL entry in the FORMAT column: all variants with an FL below 6 should be filtered out.
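A rough sketch of such a filter in plain Python, assuming a single-sample VCF where FL is an integer in the FORMAT column. Dropping records that have no readable FL is a choice made here, not something RetroSeq prescribes, and the function names are illustrative only.

```python
def fl_value(record_line):
    """Return the FL value of the first sample, or None if it cannot be read."""
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None  # no FORMAT/sample columns
    keys = fields[8].split(":")
    values = fields[9].split(":")
    if "FL" not in keys:
        return None
    idx = keys.index("FL")
    if idx >= len(values):
        return None
    try:
        return int(values[idx])
    except ValueError:
        return None


def filter_by_fl(lines, min_fl=6):
    """Yield header lines unchanged and only those records with FL >= min_fl."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fl = fl_value(line)
        if fl is not None and fl >= min_fl:
            yield line
```

For ranking instead of removal, the same `fl_value` helper could be used to set a score rather than drop the record.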
Good luck, and feel free to ask if you have any questions!