EI-CoreBioinformatics / portcullis

Splice junction analysis and filtering from BAM files
https://ei-corebioinformatics.github.io/portcullis/
GNU General Public License v3.0

portcullis::JunctionError* #39

Closed lstxmu closed 5 years ago

lstxmu commented 5 years ago

When I run the filter step, I get the following message and the program fails. Has anyone else had the same issue?

src/junction.cc(1161): Throw in function static std::shared_ptr portcullis::Junction::parse(const string&)
Dynamic exception type: boost::exception_detail::clone_impl
std::exception::what: std::exception
[portcullis::JunctionError*] = Could not parse line due to incorrect number of columns. This is probably a version mismatch. Check file and portcullis versions. Expected 75 columns. Found 74. Line: 35370 71 NW_009258283.1 122381 32109 32353 245 32095 32478 ? ? ? CC N 0 1 0 1 1 1 0 1 0 0 0 0 0.0 0 0 0 1 0.0 3.0 140 14 14 0 10 9 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0

maplesond commented 5 years ago

The error message explains the issue, albeit a bit cryptically. Different versions of portcullis are not backward compatible, so the issue you are seeing is that you generated the junction file with one version of portcullis, which produces a file with 74 columns, but are trying to filter it with a different version, which expects 75 columns. So the options are either to revert to the version you used to generate the junction file, or to regenerate the junction file with the latest version.
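One way to tell a version mismatch from a handful of malformed rows is to count how many fields each line of the junction file has. The following is a minimal sketch, assuming the file is tab-delimited (as the `.tab` extension suggests); `field_count_histogram` is a hypothetical helper, not part of portcullis.

```python
from collections import Counter


def field_count_histogram(lines, delimiter="\t"):
    """Return a Counter mapping 'number of fields' to 'number of lines'.

    Assumes a tab-delimited junction file; blank lines are ignored.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        counts[len(line.split(delimiter))] += 1
    return counts
```

Calling it as `field_count_histogram(open("2-junc/portcullis_all.junctions.tab"))` gives one entry per distinct field count. If every line has the same count and it differs from what the filter expects, a version mismatch is the likely cause; if only a few lines differ from the rest, those rows are worth inspecting individually.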

shenweima commented 5 years ago

I also get this error. This is my command:

portcullis full -t 6 -o portcullis_result_part3 --exon_gff --source portcullis --use_csi portcullis/portcullis.genome.fa mergeOut_part3.bam

So, is the version the same?

2019-04-20-024701

The arrow indicates that a value is missing on line 730468. In the file 2-junc/portcullis_all.junctions.tab, I changed 'NA' to 'NN' on line 730468 (see the image below), then ran the command portcullis filt -t 4 -o filt --exon_gff --source portcullis 1-prep 2-junc/portcullis_all.junctions.tab, and it worked.

2019-04-20-024702

Is this a bug?

maplesond commented 5 years ago

Hi @shenweima, yes, in this case it looks like a bug. The dataset is quite odd, though: the donor and acceptor sites of the junction appear to sit in a region of the genome covered by uncalled bases (N). I therefore doubt any of those junctions are genuine, or at least there is not enough evidence to support them. So your workaround would be fine, as would removing the affected rows completely. I'll try to get a fix for this out when I next get some time.
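Removing the affected rows could be sketched as below. This assumes the file is tab-delimited and that the problematic value is the literal string `NA` in the two splice-site columns, taken here to be columns 13 and 14 (1-based); both the column positions and the helper name `drop_rows_with_value` are assumptions for illustration, not something portcullis provides.

```python
def drop_rows_with_value(lines, columns=(13, 14), bad_value="NA", delimiter="\t"):
    """Yield every line except those holding bad_value in one of the columns.

    columns are 1-based and are an assumption about where the splice-site
    fields live; adjust them to match the actual file layout.
    """
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if any(c <= len(fields) and fields[c - 1] == bad_value for c in columns):
            continue  # skip the affected row
        yield line
```

Writing the result to a new file, for example with `out.writelines(drop_rows_with_value(src))`, keeps the original junction file intact in case the rows need to be checked later.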

cgjosephlee commented 5 years ago

I guess the NA string in the 13th and 14th columns was causing the problem. I tried replacing every NA with NN in 2-junc/portcullis_all.junctions.tab and no more errors appeared (v1.2.0).
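The replacement can be limited to those two columns so that any `NA` elsewhere in the file is left alone. A minimal sketch, assuming a tab-delimited file and 1-based columns 13 and 14; `replace_in_columns` and `fix_file` are hypothetical helper names, not portcullis functions.

```python
def replace_in_columns(lines, columns=(13, 14), old="NA", new="NN", delimiter="\t"):
    """Yield lines with `old` replaced by `new`, only in the given 1-based columns."""
    for line in lines:
        newline = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\n").split(delimiter)
        for c in columns:
            # Exact match only, so values that merely contain "NA" are untouched.
            if c <= len(fields) and fields[c - 1] == old:
                fields[c - 1] = new
        yield delimiter.join(fields) + newline


def fix_file(src_path, dst_path):
    """Write a patched copy of src_path to dst_path, leaving the original intact."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(replace_in_columns(src))
```

Running `fix_file` on 2-junc/portcullis_all.junctions.tab and passing the patched copy to the filter step mirrors the workaround described above, without editing the original file in place.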