itmat / rum

RNA-Seq Unified Mapper
http://cbil.upenn.edu/RUM
MIT License

Can't Generate plan for postprocessing. #175

Open gotte opened 11 years ago

gotte commented 11 years ago

Hi. I'm running RUM with hg18 for human ChIP-seq data. It gets through the preprocessing and processing steps, but it can't generate post-processing steps. It could be something as simple as incorrect settings but it doesn't appear to be. Any help would be appreciated. I've attached the settings below:

$VAR1 = bless( {
  'version' => 'v2.0.5_03',
  '_default' => bless( {
    'dna' => 1,
    'chunks' => '8',
    'forward_reads' => 'FGC0418_s_3.fastq',
    'version' => 'v2.0.5_03',
    'name' => 'K9_CTRL_OIS',
    'index_dir' => '/gpfs/fs121/h/gotte/rum-indexes/hg18',
    '_default' => bless( {
      'bowtie_nu_limit' => 100,
      'blat_rep_match' => 256,
      'blat_tile_size' => 12,
      'max_insertions' => 1,
      'blat_max_intron' => 500000,
      'blat_min_identity' => 93,
      'platform' => 'Local',
      'blat_step_size' => 6
    }, 'RUM::Config' ),
    'platform' => 'SGE',
    'output_dir' => '/gpfs/fs121/h/gotte/data/Brian_RASIMR90_05232013/rum/CTRL'
  }, 'RUM::Config' ),
  'read_length' => 50
}, 'RUM::Config' );

The rum_postprocessing file says it can't generate a plan. Same goes for when I run rum_runner status.

2013/06/01 11:10:00 node03.local 13430 INFO RUM.Workflow Starting workflow 'Postprocessing'. I will clean up intermediate temporary files along the way.
2013/06/01 11:10:00 node03.local 13430 FATAL RUM.Death No plan at /gpfs/fs121/h/gotte/rum2/bin/../lib/RUM/Workflow.pm line 518
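
For anyone hitting a similar failure, one quick way to pull the fatal entries out of the log directory is sketched below. It assumes the log level appears as a space-delimited word, as in the two lines above. The function name `fatal_lines` is illustrative and not part of RUM.

```shell
# fatal_lines DIR
# Print every line logged at level FATAL from the files under DIR,
# each prefixed with its file name and line number.
# Assumption: the level appears as a space-delimited word, as in the
# log excerpt quoted in this thread. The function name is hypothetical.
fatal_lines() {
    grep -rn ' FATAL ' "$1"
}
```

A call such as `fatal_lines log` from inside the output directory would then list each matching line. The function returns a non-zero status when nothing matches.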

Thanks

mdelaurentis commented 11 years ago

Thanks for the issue report. I can take a look at it either tomorrow or more likely first thing Monday. In the meantime, if you can do an "ls -l" in the output directory and send me the results, that would be very helpful. Would you mind also sending the contents of the log directory?

Thanks,

Mike

gotte commented 11 years ago

Hey Mike,

Thanks for looking into this. Here are the things you asked for.

ls -l
total 33093056
-rw-r--r-- 1 gotte bergerlab 1023 May 31 19:56 a_rum_K9_OIS.txt
drwxr-xr-x 2 gotte bergerlab 16384 May 31 19:50 chunks
drwxr-xr-x 2 gotte bergerlab 16384 May 31 19:55 log
drwxr-xr-x 2 gotte bergerlab 16384 May 30 14:56 postproc
-rw-r--r-- 1 gotte bergerlab 16943548283 May 30 15:36 quals.fa
-rw-r--r-- 1 gotte bergerlab 16943548283 May 30 15:36 reads.fa
-rw-r--r-- 1 gotte bergerlab 1712 May 30 15:36 rum_job_config
-rw-r--r-- 1 gotte bergerlab 3436 May 31 19:55 rum_job_report.txt
-rw-r--r-- 1 gotte bergerlab 334 May 30 14:56 rum_K9_OIS_preproc.sh
-rw-r--r-- 1 gotte bergerlab 671 May 31 19:55 rum_K9_OIS_proc.sh
-rw-r--r-- 1 gotte bergerlab 181 May 31 19:56 rum_sge_job_ids
drwxr-xr-x 2 gotte bergerlab 16384 May 30 14:56 tmp

The log folder has a lot of files. I've compressed it. Do you have an email I can send it to?

Thanks,

Gabe

mdelaurentis commented 11 years ago

Gabe,

Actually, it turns out I asked for the wrong thing. Instead of "ls -l", could you just do "find ." in the output directory, redirect the results to a file, and send it to me? Please send it to midel@mail.med.upenn.edu.
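
The request above can be sketched as a small shell function. The name `list_output_files` and both arguments are illustrative. The only part taken from the thread is running "find ." inside the output directory and redirecting the result to a file.

```shell
# list_output_files OUTPUT_DIR LISTING_FILE
# Run "find ." inside OUTPUT_DIR and save the result to LISTING_FILE.
# The redirection is opened before the cd, so a relative LISTING_FILE is
# resolved against the caller's working directory. Keep LISTING_FILE
# outside OUTPUT_DIR so the listing does not include itself.
# The function name and arguments are hypothetical.
list_output_files() {
    ( cd "$1" && find . ) > "$2"
}
```

For example, `list_output_files /path/to/output_dir rum_files.txt` writes one relative path per line, starting with `.` for the directory itself.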

There doesn't seem to be anything fishy with the options you specified. It looks like the only non-required option you gave was --dna. The only thing that occurs to me right now is that for some reason one or more of the intermediate output files is missing. That could happen if there's some kind of filesystem error. Or it could happen if a program crashes in a certain way, although RUM does have some things in place that usually prevent that from happening. Other times, we've found that users will delete intermediate files in an effort to save space, without realizing that those files are still needed.

Anyway, the error you're seeing, saying it can't create a postprocessing plan, most likely means that some of the intermediate files are lost for some reason. Have you run this multiple times from scratch (in a fresh output directory) and seen the same error? If so, that would indicate a bug in RUM. If on the other hand you've only tried to run it once, it could be some kind of transient system error.
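
As a rough first check for the damaged-file scenario described above, one could look for zero-byte files in the output directory. This is only a sketch. An empty file is not necessarily an error, a file that is missing altogether will not show up, and the function name `empty_files` is illustrative rather than part of RUM.

```shell
# empty_files DIR
# Print the path of every zero-byte regular file under DIR.
# "-size 0" counts 512-byte blocks rounded up, so it matches only
# files that contain no data at all.
# The function name is hypothetical; this is a heuristic, not a RUM check.
empty_files() {
    find "$1" -type f -size 0
}
```

Running `empty_files /path/to/output_dir` prints nothing when every file has some content.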

Thanks,

Mike

mdelaurentis commented 11 years ago

Gabe,

It looks like there's a bug in RUM 2.0.5_03 when using --dna mode. This is an unintended effect of a feature that we introduced in that version, which we actually reverted in the latest version (2.0.5_04). In 2.0.5_03, we added an extra step to the postprocessing where we modify the SAM file based on some information we obtain through the junction calling step. In --dna mode, we don't do that junction calling step, so RUM doesn't know how to create the final SAM file. That feature was short-lived; we decided to remove it in 2.0.5_04 for other reasons.

Anyway, if you can upgrade to 2.0.5_04 and simply run "rum_runner resume --postprocess -o OUTPUT_DIR", I think it should fix your problem. I tried running a very small job using 2.0.5_03 with --dna mode, and I got the same error you did. Then I switched to 2.0.5_04 and did "rum_runner resume --postprocess ..." and it worked fine.
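
The resume step can be wrapped in a small guard, sketched below. It assumes the upgraded rum_runner (2.0.5_04) is the one found first on PATH. The function name `resume_postprocessing` and the directory check are illustrative additions. The rum_runner invocation itself is the one quoted above.

```shell
# resume_postprocessing OUTPUT_DIR
# Re-run only the postprocessing phase of an existing RUM job.
# Assumes the upgraded rum_runner is first on PATH.
# The function name and the directory check are hypothetical additions;
# the rum_runner arguments are those given in this thread.
resume_postprocessing() {
    if [ ! -d "$1" ]; then
        echo "output directory not found: $1" >&2
        return 1
    fi
    rum_runner resume --postprocess -o "$1"
}
```

Calling `resume_postprocessing /path/to/output_dir` fails early with a message if the directory does not exist, and otherwise hands off to rum_runner.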

Please let me know if it works for you. Thanks for your patience.

Mike

gotte commented 11 years ago

Hey Mike,

That seems to have done it. Thanks so much!

Best,

Gabe

mdelaurentis commented 11 years ago

Great, I'm glad to hear it.
