Open gabrieljaykay opened 2 years ago
Hello! Sorry for the inconvenience. Could you check whether chr10_KN196480v1_fix is present in the file STAR/chrNameLength.txt? If it's not present, that might be the issue. chrNameLength.txt is created by STAR and is used in one of the reference build steps; it should contain that sequence if it was in the genome.fa file. Is it possible that your genome.fa file contains only the canonical chromosomes and not the fix patches? If so, you can either replace genome.fa with the full one or filter transcripts.gtf to keep only the canonical chromosomes. With awk, it should be something like:
cp ./transcripts.gtf ./transcripts_bkup.gtf
awk ' $1 ~ /^chr[0-9XY]+$/ {print } ' ./transcripts_bkup.gtf > ./transcripts.gtf
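If you want to see which sequence names are affected before filtering, a check along these lines might help. It is only a sketch: missing_contigs is a hypothetical helper name, not part of IRFinder or STAR, and it assumes chrNameLength.txt is tab-separated with the sequence name in the first column and that GTF header lines start with '#'.

```shell
# Hypothetical helper (not part of IRFinder or STAR): print each sequence
# name used in a GTF that does not appear in a STAR chrNameLength.txt file.
missing_contigs() {
    gtf="$1"          # e.g. ./transcripts.gtf
    chr_lengths="$2"  # e.g. ./STAR/chrNameLength.txt
    awk -F '\t' '
        # First input file: remember every name STAR knows about (column 1).
        FILENAME == ARGV[1] { known[$1] = 1; next }
        # Second input file: skip header/comment lines.
        /^#/ { next }
        # Report each unknown sequence name only once.
        !($1 in known) && !seen[$1]++ { print $1 }
    ' "$chr_lengths" "$gtf"
}

# Usage (paths are assumptions based on the layout described above):
#   missing_contigs ./transcripts.gtf ./STAR/chrNameLength.txt
```

If this prints nothing, every sequence name in the GTF is known to STAR and the problem is probably elsewhere. Note that the filter above keeps only chr1-22, chrX and chrY, so it would also drop chrM and any '#' header lines.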
Let me know if this helped. Cheers, Claudio
Hello,
I've been trying to build an IRFinder reference to look for intron retention in RNA-seq data stored in BAM files that are in the UCSC format, so I need to build the reference from the UCSC genome sequence instead of Ensembl. I'm running BuildRefProcess with the FASTA and GTF files from UCSC in the same folder, named 'genome.fa' and 'transcripts.gtf' respectively. I keep getting this error, and I'm uncertain why it would run into an issue with this specific chromosome. Any help you could provide would be greatly appreciated.