Closed maolingfengZJU closed 4 years ago
I have the same problem: I specified my reference genome (also GRCh38) using --samtools_T, but I got the same error.
Could you post the output of samtools idxstats for your BAM?
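For anyone who wants to inspect that output programmatically, here is a minimal sketch that turns samtools idxstats text into a contig-to-length mapping. The function name parse_idxstats and the read counts in the example are made up for illustration; the only thing relied on is the documented idxstats layout (tab-separated contig name, contig length, mapped count, unmapped count, plus a final "*" row for unplaced reads).

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` output into {contig_name: contig_length}.

    Hypothetical helper for illustration; not part of HLA*LA or samtools.
    """
    contigs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        name, length = fields[0], fields[1]
        if name == "*":
            continue  # "*" row counts unplaced reads; it is not a contig
        contigs[name] = int(length)
    return contigs


# Illustrative input: real GRCh38 lengths for chr1/chr6, made-up read counts.
example = "chr1\t248956422\t1000\t5\nchr6\t170805979\t800\t2\n*\t0\t0\t42\n"
print(parse_idxstats(example))
```

The resulting names and lengths are what you would compare against a reference file when checking whether the BAM's contigs are recognised.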
Yes, I attached the idxstats for my BAM here. I created a graph myself (and placed it under the knownReferences directory) but I got an error like this when running HLA*LA:
Graph directory ../graphs/PRG_MHC_GRCh38_withIMGT/knownReferences does not seem to be complete - does this directory specify a valid graph for HLA-LA? at src/HLA-LA.pl line 203.
As background, I included HLA sequences (alternative contigs) in my reference genome, so the BAM has those too.
I would really appreciate it if you could help me create a graph, and point out what is wrong with the one I created. Thanks a lot!
Hi @miko-798, if you say you created a graph yourself, what exactly do you mean by that? I assume you mean you have a custom reference file for your BAMs? The reference file you attached looks OK. What value do you use for the --graph parameter?
Hi @AlexanderDilthey, thanks for your reply. The file "ERCC_HLA_graph.txt" I attached earlier is the graph I created. I did have a custom reference file in fasta format when I generated the BAMs (GRCh38.p12.genome.plus.ERCC.HLA.fa, which includes extra ERCC contigs, as well as HLA sequences as alternative loci). I am pretty sure the graph I created contains all the contigs in the BAM.
For --graph, I used the same directory as before: --graph PRG_MHC_GRCh38_withIMGT, and I tried putting the file "ERCC_HLA_graph.txt" under the same directory, as well as under the knownReferences directory. But I got the error I described earlier.
Should I index the graph (I downloaded another copy of the data package, copied my graph there and tried indexing, but got another error, screenshot below)? Or what is a good way to solve this?
Thanks a lot for your help.
Hi @miko-798, OK - I think I understand what's going on! What you need is not a new graph, but merely a new reference extraction file. Here is what should work:
1. Restore graphs/PRG_MHC_GRCh38_withIMGT to its original state, e.g. by re-downloading the data package and indexing the graph.
2. Place your file in src/additionalReferences/PRG_MHC_GRCh38_withIMGT.
Hi @AlexanderDilthey,
Thanks a lot. It worked!
Actually, I still have a question about the graph. How do I know which graph the tool is using? Can I get that information from the log file? After I put the file I created under the directory you specified, src/additionalReferences/PRG_MHC_GRCh38_withIMGT, and ran HLA*LA, I got this in the logs:
Graph serialization existing and newer than graph file; read from /home/mikoliu798/HLA-LA/src/../graphs/PRG_MHC_GRCh38_withIMGT/serializedGRAPH
How do I make sure the tool is actually using the graph I created? I also attached the complete log here. 4084B_modified_graph.log
Thanks so much for your help!
There is currently only one graph, PRG_MHC_GRCh38_withIMGT. I think you are referring to the reference extraction file, right? The tool will complain if it finds no suitable file or more than one; i.e. if it produces output, you can be certain it used your file.
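To make the "exactly one suitable file" behaviour concrete, here is a rough sketch of that kind of check. It is not HLA*LA's actual code: the function find_matching_spec, the spec names, and the contig data are all hypothetical, and it assumes that "suitable" means the file lists exactly the same contig names and lengths as the BAM, which may differ from the tool's real criteria.

```python
def find_matching_spec(bam_contigs, specs):
    """Return the name of the single spec whose contigs equal the BAM's.

    bam_contigs: {contig_name: length}, e.g. from samtools idxstats
    specs: {spec_name: {contig_name: length}}
    Assumption: compatibility is exact equality of names and lengths.
    """
    matches = [name for name, contigs in specs.items() if contigs == bam_contigs]
    if not matches:
        raise ValueError("no compatible reference specification found")
    if len(matches) > 1:
        raise ValueError(
            "more than one compatible specification: " + ", ".join(sorted(matches))
        )
    return matches[0]


# Hypothetical data: a BAM with one extra spike-in contig.
bam = {"chr6": 170805979, "ERCC-00002": 1061}
specs = {
    "plain_grch38": {"chr6": 170805979},
    "custom_with_ercc": {"chr6": 170805979, "ERCC-00002": 1061},
}
print(find_matching_spec(bam, specs))
```

With a check like this, zero matches or several matches stop the run with an error, so any successful run implies that one specific file was picked.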
Hello, how can I use a specific IMGT version to build MHC_GRCh38_withIMGT?
I downloaded the GRCh38 reference from Ensembl and built the index with bowtie2, but I got the error below. Can anyone give me some advice?
Have found no compatible reference specifications in /public/home/fanlj/mlf/HLA/HLA-LA/src/../graphs/PRG_MHC_GRCh38_withIMGT/knownReferences - create a file for this BAM file and try again. at ../../HLA-LA/src/HLA-LA.pl line 315