walaj / svaba

Structural variation and indel detection by local assembly
GNU General Public License v3.0

Error: Found chromosome in region file not in reference genome. Skipping #138

Open shguturu opened 3 months ago

shguturu commented 3 months ago

Hello! I am trying to run svaba for targeted SV detection.

The command works with Svaba v1.1.0 but with v1.2.0 it throws the following error:

"Found chromosome in region file /rtsess01/juno/home/arorak/resources/ACCESSv2_targets_coverage.100bp_padding.bed not in reference genome. Skipping.
   Caught error: regex_error
ERROR: Cannot read region file: /rtsess01/juno/home/arorak/resources/ACCESSv2_targets_coverage.100bp_padding.bed or something wrong with bam header ('chr' prefix mismatch?)"

I am using the same region file that worked with Svaba v1.1.0, so I am unsure why it is not working with the new version. Is there a change that needs to be made to the region file? Thanks in advance!

Here is the command for reference: "svaba run -G $REF_GENOME.fa -t $TUMOR.bam -n $NORMAL.bam -k $targets.bed -C 5000 -p 4 -a svaba_trial"

walaj commented 3 months ago

I think the error message could be better worded. What it is catching is a failed attempt to find a chromosome name from the BED file in the BAM header. Can you confirm that there are no chromosome names in the BED file that are not in the tumor BAM header? This check doesn't actually look at the reference FASTA, since that is done elsewhere.

If still an issue, I can try to recreate if you provide the BAM header (as a txt file aka SAM file) and the targets bed file. Dropbox link is fine. Should be no actual data in those files.
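The name check described above can be approximated by hand. The following is a minimal sketch, assuming the BAM header has been dumped to text (for example with samtools view -H) and that the BED file is whitespace-delimited. The function names are illustrative and not part of svaba, and the sketch only compares names; it does not reproduce svaba's own parsing.

```python
def sam_header_contigs(header_lines):
    """Collect sequence names from the SN: tag of @SQ header lines."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def bed_contigs(bed_lines):
    """Return distinct chromosome names from column 1, in order of first use."""
    names = []
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and UCSC-style track/browser lines.
        if not fields or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        if fields[0] not in names:
            names.append(fields[0])
    return names


def missing_contigs(bed_lines, header_lines):
    """List BED chromosome names that have no matching @SQ entry."""
    known = sam_header_contigs(header_lines)
    return [name for name in bed_contigs(bed_lines) if name not in known]
```

With hypothetical file names, calling missing_contigs(open("targets.bed"), open("header.sam")) returns an empty list when every BED chromosome appears in the header. A non-empty result such as ["chr1", "chr2"] against a header that uses "1" and "2" would point to a 'chr' prefix mismatch.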

walaj commented 3 months ago

I'm not able to re-create, but you could try adding this to the try block of svabaUtils.cpp right after file_regions = SeqLib::GRC(region_file, h);

for (const auto& a : file_regions) {
  std::cerr << a.ToString(h) << std::endl;
  assert(a.pos2 >= a.pos1);
}
std::cerr << h.AsString() << std::endl;

Then recompile, rerun, and inspect the output.
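If recompiling is not convenient, the coordinate check that the assert performs (end not before start) can be run on the BED file directly. This is a rough sketch, assuming standard BED columns (chrom, start, end); bad_bed_intervals is a hypothetical helper, not a svaba or SeqLib function, and passing it does not guarantee that svaba will accept the file.

```python
def bad_bed_intervals(bed_lines):
    """Return (line_number, reason) pairs for records with unusable coordinates."""
    problems = []
    for lineno, line in enumerate(bed_lines, start=1):
        fields = line.split()
        # Skip blank lines, comments, and UCSC-style track/browser lines.
        if not fields or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        if len(fields) < 3:
            problems.append((lineno, "fewer than 3 columns"))
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            problems.append((lineno, "non-integer coordinate"))
            continue
        # Mirrors the intent of assert(a.pos2 >= a.pos1) in the snippet above.
        if start < 0 or end < start:
            problems.append((lineno, "end before start or negative start"))
    return problems
```

Calling bad_bed_intervals(open("targets.bed")) on a hypothetical file path returns an empty list when every record has integer coordinates with end at or after start.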

shguturu commented 3 months ago

I'm currently using svaba through conda. I'm running into a separate error with htslib when I try to build svaba myself (even after specifically including the htslib path in CPATH). Do you have any ideas about what might have changed between version 1.2.6 and the previous version that could be causing the problem? The same command and files were working on the older version.

-Shivani


kanika-arora commented 3 months ago

@walaj I modified the svabaUtils.cpp file like you suggested, but it doesn't change anything. It doesn't write any other output and simply gives the error message that Shivani mentioned previously. Moreover, if instead of providing a bed file, I specify a chromosome, for example -k 1, then I get the following message:

terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
Aborted (core dumped)

And as mentioned in the previous comments, these exact same commands seem to work and produce results when run on SvABA v1.1.0 installed from Bioconda.

kanika-arora commented 3 months ago

Hi @walaj, do you have any other suggestions for debugging this issue? I tried running the tool without the -k option as well. That runs for a little bit, generates some intermediate files but eventually throws a different error and fails.