itmat / rum

RNA-Seq Unified Mapper
http://cbil.upenn.edu/RUM
MIT License

multi-mapper in RUM_Unique... revisited #158

Open safisher opened 11 years ago

safisher commented 11 years ago

Running 2.0.3_04 and get the following error:

Tue Dec 18 16:39:21 2012 50223 WARN RUM::Script::CountReadsMapped - Looks like there's a multi-mapper in the RUM_Unique file. 20933270 () seq.20933270a chr2 51817713-51817757 + CAGCATTGGTAGCCACTAGTTAGACAAGAATGTTGTTCGAATTAA
Tue Dec 18 16:39:21 2012 50223 WARN RUM::Script::CountReadsMapped - Looks like there's a multi-mapper in the RUM_Unique file. 20933270 () seq.20933270a chr2 51817713-51817757 + CAGCATTGGTAGCCACTAGTTAGACAAGAATGTTGTTCGAATTAA

I ran this alignment twice and got the same error both times. I'm saving the RUM directory for now, in case you're interested in having a look. It's on the Kim cluster in kimclust17:/local17/fisher/e.20/RN0015/rum.trim

The source files are in kimclust17:/local17/fisher/e.20/RN0015/trim.Ad.PolyAT

Thanks!

mdelaurentis commented 11 years ago

I'll take a look at it. I briefly looked at the log file, and it does appear to be the same issue as last time. I'll run a very small job that includes that read, with the --no-clean option, and see at which step of the pipeline it first shows up as a duplicate.
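
For anyone following along, here is a rough Python sketch of the kind of check that could be run against each intermediate file kept by --no-clean. It assumes (this is not confirmed anywhere in this thread) that each line is whitespace-separated with the read ID in the first column, as in the `seq.20933270a chr2 ...` line from the warning, and that a duplicate means the exact same ID on more than one line. The function name and the second sample line are made up for illustration.

```python
from collections import Counter


def duplicated_read_ids(lines):
    """Return read IDs that occur on more than one line, in first-seen order.

    Assumption: the read ID is the first whitespace-separated field.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return [read_id for read_id, n in counts.items() if n > 1]


# The first record is taken from the warning above; the second is hypothetical.
sample = [
    "seq.20933270a\tchr2\t51817713-51817757\t+\tCAGCATTGGTAGCCACTAGTTAGACAAGAATGTTGTTCGAATTAA",
    "seq.20933271a\tchr2\t100-144\t+\tACGT",
    "seq.20933270a\tchr2\t51817713-51817757\t+\tCAGCATTGGTAGCCACTAGTTAGACAAGAATGTTGTTCGAATTAA",
]
print(duplicated_read_ids(sample))
```

Running the same function over the output of each pipeline step, in order, would show the first file in which the ID is repeated.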

mdelaurentis commented 11 years ago

I ran a very small job that includes that read, plus the hundred reads that appear before and after it in the input file. The read wasn't duplicated in those results; it appeared just once in the RUM_Unique file. Now I'm running the full job, with the same number of chunks that you used (30). I'm thinking the duplication may have something to do with the read's position in the input.
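
A minimal sketch of how a small test input like that could be cut out, in Python. It assumes the input has already been parsed into an iterable of read records and that the target read is identified by its 1-based position in that input; neither the record parsing nor the helper name comes from RUM itself.

```python
from itertools import islice


def reads_around(records, target, flank=100):
    """Return the record at 1-based position `target` plus up to `flank`
    records on each side, without loading the whole input into memory."""
    start = max(target - flank, 1)  # clamp at the first record
    stop = target + flank           # islice stops early if the input is shorter
    return list(islice(records, start - 1, stop))


# Stand-in for parsed reads: integers 1..1000 instead of real records.
subset = reads_around(iter(range(1, 1001)), target=500, flank=100)
print(len(subset), subset[0], subset[-1])
```

Because a subset like this changes where the read sits relative to the chunk boundaries, it can only rule the read's content in or out as the cause; it says nothing about position-dependent behavior.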