Closed gatoravi closed 10 years ago
The problem seems to be here: 2014-01-23 18:24:25-0600 GGM: Error in read.table(file = filename, skip = skip, sep = sep) : 2014-01-23 18:24:25-0600 GGM: no lines available in input 2014-01-23 18:24:25-0600 GGM: Calls: varscan.load_snp_output -> read.table
/opt/gms/AUIS907/fs/AUIS907/info/model_data/9139d1d4d0d6481db98962e328c724ee/build3b25aebceea541348337ffc207f37e21/TST1/clonality/clonality.R tries to call varscan.load_snp_output from /opt/gms/AUIS907/sw/genome/lib/perl/Genome/Model/Tools/Validation/VarScanGraphLib.R
varscan.load_snp_output("/tmp/1188.tmpdir/gm-genome_sys-2014-01-23_18_24_21--7pXX/anonymous4",header=F)->xcopy;
varscan.load_snp_output("/tmp/1188.tmpdir/gm-genome_sys-2014-01-23_18_24_21--7pXX/anonymous4",header=F,min_tumor_depth=20,min_normal_depth=20)->xcopy100;
This is initiated by /opt/gms/AUIS907/sw/genome/lib/perl/Genome/Model/Tools/Validation/ClonalityPlot.pm which creates the R script:
varscan.load_snp_output(\"$temp_path\",header=F)->xcopy;
varscan.load_snp_output(\"$temp_path\",header=F,min_tumor_depth=$readcount_cutoff,min_normal_depth=$readcount_cutoff)->xcopy100;
$temp_path is defined as follows: $temp_path = $self->varscan_file_with_cn; $tfh = FileHandle->new($temp_path,"w"); OR ($tfh,$temp_path) = Genome::Sys->create_temp_file; The latter seems to be the case since I can't find evidence of the varscan_file_with_cn option every being specified and the R output contains what looks like a temp path.
Thus, the apparently missing temp file is created with the following block of code
my %copynumber_hash_tumor;
if(defined($cnvhmm_file)){
#build the copy number hashes
%copynumber_hash_tumor=%{&build_hash($cnvhmm_file)};
} elsif(defined($cbs_file)){
#build the copy number hashes
%copynumber_hash_tumor=%{&build_hash_cbs($cbs_file)};
}
#read in the varscan input
my $varscan_input = new FileHandle ($varscan_file);
while (my $line2 = <$varscan_input>) {
chomp($line2);
my ($chr, $pos, $ref, $var,
$normal_ref, $normal_var, $normal_var_pct, $normal_IUB,
$tumor_ref, $tumor_var, $tumor_var_pct, $tumor_IUB,
$varscan_call, $germline_pvalue, $somatic_pvalue, @otherstuff) = split(/\t/, $line2);
my $varscan_cn_tumor=&get_cn($chr,$pos,$pos,\%copynumber_hash_tumor);
print $tfh "$line2\t$varscan_cn_tumor\n";
if ($positions_highlight && -s $positions_highlight) {
my $matcher = "$chr\t$pos";
if (defined $position_highlight_hash{$matcher}) {
$position_added--;
my $depth = $tumor_ref + $tumor_var;
my $varallelefreq = $tumor_var_pct;
$varallelefreq =~ s/%//;
print $tfh2 "$varscan_cn_tumor\t$varallelefreq\t$depth\n";
}
}
}
$tfh->close;
$tfh2->close;
$varscan_file or $filtered_file is specified as the varscan_file option when Genome::Model::Tools::Validation::ClonalityPlot is called.It looks that is called from Command/GenerateClonalityPlots.pm. Working backwards:
my $filtered_file = $varscan_file . ".filt"
my $varscan_file = $readcounts_varscan_file;
my $readcounts_varscan_file = "$readcounts_outfile".".varscan";
my $readcounts_outfile;
if ($self->read_counts) {
$readcounts_outfile = $self->read_counts;
}
else {
$readcounts_outfile = "$adapted_file".".readcounts";
It looks like $adapted_file is being used because the read_counts option is not being specified to Genome::Model::ClinSeq::Command::GenerateClonalityPlots when called in ClinSeq.pm Therefore the missing file is possibly: my $adapted_file ="$output_dir"."allsnvs.hq.novel.tier123.v2.bed.adapted"; where $output_dir is the $clonality_dir as specified in ClinSeq.pm
Therefore, the missing file should be: /opt/gms/AUIS907/fs/AUIS907/info/model_data/9139d1d4d0d6481db98962e328c724ee/build3b25aebceea541348337ffc207f37e21/TST1/clonality/ allsnvs.hq.novel.tier123.v2.bed.adapted.readcounts.varscan or allsnvs.hq.novel.tier123.v2.bed.adapted.readcounts.varscan.filt
Indeed, the latter is empty.
Data are filtered from allsnvs.hq.novel.tier123.v2.bed.adapted.readcounts.varscan to produce allsnvs.hq.novel.tier123.v2.bed.adapted.readcounts.varscan.filt as follows:
next if ($tumor_cov < $self->min_tumor_cov || $tumor_cov > $self->max_tumor_cov || $normal_vaf > $self->max_normal_vaf);
There are only 4 variants that ended up in the unfiltered file. It is quite possible that none meet these criteria (hard to tell immediately without headers in the file). Should we add some logic somewhere to prevent an attempt at clonality plot generation when this file is empty? This situation is highly unlikely to occur except when using the small demo data.
Perhaps we can just add some logic to only call Genome::Model::Tools::Validation::ClonalityPlot->create() when varscan_file is non-zero?
This seems to have worked. See commits: https://github.com/genome/gms-core/commit/9e976f8028cec3f2e04eb86abd6bd105cb71cb30 and https://github.com/genome/gms-core/commit/34b1a14c641c5da0c41a99d96612819654ee0d33
The clonality analysis step within ClinSeq fails on Obi's box with this error,
2014-01-21 23:27:27-0600 GGM: ERROR RUNNING COMMAND. Exit code 1 from: R --vanilla --slave < /opt/gms/AUIS907/fs/AUIS907/info/model_data/9139d1d4d0d6481db98962e328c724ee/builde0f110f79090452eb2ba5d4fc937e899/TST1/clonality/clonality.R
Looks like the R script is trying to access a file that does not exist. This might be somehow related to the previous errors.