AMI run returns no results for 200k reads containing viruses

chiulab / surpi

SURPI

chiulab.ucsf.edu/surpi

AMI run returns no results for 200k reads containing viruses #15

Closed bede closed 9 years ago

bede commented 9 years ago

I ran SURPI to test it on a small sample comprising 200k already prefiltered Illumina reads (93MB fastq). I know there are viruses in there: Kraken finds perfect 31-mer matches for a handful of RefSeq viruses. I was hoping to get some taxonomic assignments, but SURPI doesn't seem to have generated any useful information, and most of the output files are empty.

I deployed the AMI to an m4.2xlarge instance, rsynced the reads across, and ran SURPI.sh in the default comprehensive mode. The fastq is 60MB after cutadapt, which seems reasonable. I get a SAM file from SNAP, but no taxonomic information and no de novo assembly.

Can anyone suggest what might be going wrong here? I'd love to be able to use and cite SURPI.

Here is a screenshot of the output file listing: https://www.dropbox.com/s/1a80v3raf03nem9/Screen%20Shot%202015-08-10%20at%2012.03.44.png?dl=0

sfederman commented 9 years ago

Hi Bede,

Thank you for your comment and screenshot, and apologies for the delay.

It looks like all of your files within the output folder are of 0 length, indicating that the run failed at some early point. If you still have the SURPI files, can you attach or send me the 2 files below:

SURPI.1.inc_orphans.clean.dedup.fq.err

SURPI.1.inc_orphans.clean.dedup.fq.log

Along with the first few reads from your file:

1.inc_orphans.clean.dedup.fq
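For anyone gathering the same information, here is a minimal Python 3 sketch (not part of SURPI) that lists the zero-length files in an output folder and pulls the first few reads from a FASTQ file. The function names and the paths in the usage comments are hypothetical, and it assumes an uncompressed FASTQ with exactly four lines per read.

```python
import os
from itertools import islice


def empty_files(folder):
    """Return sorted names of zero-length regular files directly inside folder."""
    return sorted(
        entry.name
        for entry in os.scandir(folder)
        if entry.is_file() and entry.stat().st_size == 0
    )


def head_reads(fastq_path, n_reads=5):
    """Return the first n_reads records of a FASTQ file as a single string.

    Assumes the common 4-lines-per-record layout (header, sequence, '+', qualities).
    """
    with open(fastq_path) as handle:
        return "".join(islice(handle, 4 * n_reads))


# Example usage (hypothetical paths, adjust to your own run):
#   print(empty_files("OUTPUT_1.inc_orphans.clean.dedup"))
#   print(head_reads("1.inc_orphans.clean.dedup.fq", n_reads=3))
```

If most of the names in the output folder come back from `empty_files`, that matches the "0 length" symptom described above, and the `.err` and `.log` files are the next place to look.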

Thanks - I'll try to troubleshoot with these files.

-Scot

voutcn commented 9 years ago

@sfederman SURPI also failed to generate count tables in our test. You can find the log & err files here: https://www.dropbox.com/sh/xi1j5z292mqhiw9/AADlFB0GlM0BGLnrAntISeG2a?dl=0

sfederman commented 9 years ago

@voutcn

You're also likely getting empty coverage plots due to your installed version of matplotlib, which looks to be at least v1.3.1. Using the latest version of coveragePlot.py from the repository should fix this (matplotlib v1.3.1 deprecated the load command).
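To check which matplotlib version is installed, a small Python 3 sketch like the one below may help. It is not part of SURPI: the helper names are hypothetical, the 1.3.1 threshold is taken only from the comment above, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata


def version_tuple(version):
    """Turn a version string such as '1.3.1' or '1.4.0rc1' into leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_matplotlib_version():
    """Return the installed matplotlib version string, or None if it is not installed."""
    try:
        return metadata.version("matplotlib")
    except metadata.PackageNotFoundError:
        return None


def needs_updated_coverage_plot(version, threshold=(1, 3, 1)):
    """True if version is at or above the threshold mentioned in this thread (an assumption)."""
    return version_tuple(version) >= threshold


# Example usage:
#   v = installed_matplotlib_version()
#   if v is not None and needs_updated_coverage_plot(v):
#       print("matplotlib", v, "- consider the latest coveragePlot.py")
```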

As for the empty count table, it's hard to say what is happening. It looks like SURPI ran properly for you at least through the alignment to NT.

Also - would you mind starting a new thread with this topic? The original thread is regarding issues on the AMI, which you do not appear to be using here. I'll move my response if you can do so.

Thanks,

-Scot

bede commented 9 years ago

Hi @sfederman, thanks for your reply. I'm afraid this was time-sensitive, and after receiving no reply for a month either on here or from the corresponding author, I gave up and deleted the run. I suppose that was silly of me.

I gather others have reported similar issues with the AMI not returning results for their reads. I may look to SURPI in the future but can't justify trying it again for now.

Thanks again, Bede