idinov / informatics

Automatically exported from code.google.com/p/informatics

CNV2 issue still open #18

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,
I saw that the CNV2 (CNVer) issue has been closed, but the problem is still there 
(see issue #8).

I get this error:
mkdir /ifs: Permission denied at /applications/CNVer/cnver-0.8.1/src/cnver.pl line 100

I have full execute (x) permission on the script, as you can see. Do you have the same 
issue?

Federica

Original issue reported on code.google.com by federica...@gmail.com on 18 Aug 2011 at 5:23

GoogleCodeExporter commented 9 years ago
Just noticed this... are you still having this issue? This workflow still 
references cranium /ifs directories. I'm re-running it now with fgene1 
directories.

You can do the same: just change the "ref folder" and "work dir" parameters to 
point to fgene directories. For work dir, I used my "home" directory, 
/projects2/alexgenco, and for ref folder, I just used the parent directory of 
the reference file, i.e. 
/projects1/Reference_genomes/hg18_ensembl/gatk-canonical

Original comment by alexge...@gmail.com on 25 Aug 2011 at 7:43
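
A minimal sketch of how one might check a candidate "work dir" before launching the workflow, assuming the `mkdir /ifs` failure is an ordinary filesystem permission problem. The function name `first_unwritable_ancestor` is hypothetical and is not part of CNVer or the pipeline.

```python
import os


def first_unwritable_ancestor(path):
    """Return the nearest existing ancestor of `path` (or `path` itself)
    if the current user cannot create entries in it, else None."""
    current = os.path.abspath(path)
    # Walk upwards until we reach something that actually exists.
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    # Creating a subdirectory needs write and search permission on the parent.
    if os.access(current, os.W_OK | os.X_OK):
        return None
    return current


if __name__ == "__main__":
    # The work dir from the comment above; any other path can be substituted.
    blocker = first_unwritable_ancestor("/projects2/alexgenco")
    print("OK" if blocker is None else f"cannot write under {blocker}")
```

For a path such as /ifs that does not exist on the server, an unprivileged account would typically get "/" back, which is consistent with the mkdir error reported above.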

GoogleCodeExporter commented 9 years ago
Yes, I had this issue until Friday. I didn't know about these parameters, and you 
are right, it sounds like this is the problem. Let me know if your run works; I will 
try as well. If everything works, feel free to copy it into the server library and 
to send me a copy so I can keep track of the changes and the errors.

thanks!!!

federica

Original comment by federica...@gmail.com on 26 Aug 2011 at 8:07

GoogleCodeExporter commented 9 years ago
In fact, I have a question: how can I change the path? Where is it specified? I 
checked the cnver.sh script but I didn't find those settings. Thanks, 
so I can learn where to make changes. :)

Fed

Original comment by federica...@gmail.com on 26 Aug 2011 at 8:11

GoogleCodeExporter commented 9 years ago
To set module parameters, just double-click the corresponding input circle 
above the module. You can hover over each circle for a second to see which 
parameter it corresponds to. 

These are a little old, but they might be helpful to you.
http://pipeline.loni.ucla.edu/support/screencasts/

My run for this workflow got disconnected, I'll try it again though.

Original comment by alexge...@gmail.com on 29 Aug 2011 at 6:41

GoogleCodeExporter commented 9 years ago
Hi, thanks for your help. I have changed the work and ref dir accordingly. Now 
another weird problem has come up, not in the CNVer run but in the BOWTIE 
alignment (which was not happening before). I got this error:

Extra parameter(s) specified: "/usr/pl_cache/pipeline/2011September06_14h17m35s444ms/Bowtie_1.OutputSam-1.sam"
Note that if <mates> files are specified using -1/-2, a <singles> file cannot
also be specified.  Please run bowtie separately for mates and singles.
Command: /applications/BOWTIE/bowtie-0.12.7/bowtie -1 /projects2/USC/canonical-test-data/hg18_ensembl-bwa/mini-bam/s_1_1_sequence.150k.txt -2 /projects2/USC/canonical-test-data/hg18_ensembl-bwa/mini-bam/s_1_2_sequence.150k.txt --sam /projects1/test_pipeline/NEW_ORGANIZATION/CNV/temp_CNV2 '-v 3 -a -m 600 --best --strata' /usr/pl_cache/pipeline/2011September06_14h17m35s444ms/Bowtie_1.OutputSam-1.sam

The command line is absolutely correct (-1 and -2 flags for the fwd and rev 
reads). I tried the command line myself, and it works, even if it doesn't give any 
alignment:

/applications/BOWTIE/bowtie-0.12.7/bowtie -1 /projects2/USC/canonical-test-data/hg18_ensembl-bwa/mini-bam/s_1_1_sequence.150k.txt -2 /projects2/USC/canonical-test-data/hg18_ensembl-bwa/mini-bam/s_1_2_sequence.150k.txt --sam /projects1/test_pipeline/NEW_ORGANIZATION/CNV/temp_CNV2 '-v 3 -a -m 600 --best --strata' /usr/pl_cache/pipeline/2011September06_14h17m35s444ms/Bowtie_1.OutputSam-1.sam

# reads processed: 37500
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 37500 (100.00%)
No alignments

The input file with the reads is exactly the same one I was using before. I have 
attached the pipeline. Have you experienced the same error?

federica

Original comment by federica...@gmail.com on 6 Sep 2011 at 9:24

GoogleCodeExporter commented 9 years ago
Yes, I'm getting this same error in the bowtie module. My hunch is that it has to do 
with '-v 3 -a -m 600 --best --strata' being in single quotes, but I could be 
wrong. I've attached a fix for this, but it's probably not the only issue.

Has there been a change in the fgene1 directory structure? Most of the files 
from this workflow don't exist anymore, and it doesn't pass validation. I also 
can no longer ssh into fgene1. I'll shoot Bernard an email and see what's up. 

Original comment by alexge...@gmail.com on 7 Sep 2011 at 6:28
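
To illustrate the single-quote hunch in the comment above, here is a small sketch using Python's standard `shlex` module, which splits a string by POSIX shell quoting rules. It only shows how quoting changes the argument count; whether the pipeline hands the command to a shell in this way is an assumption, and the sketch does not claim to explain the "Extra parameter(s)" message on its own.

```python
import shlex

# Option string exactly as it appears in the failing Bowtie command line.
quoted = "'-v 3 -a -m 600 --best --strata'"
# The same options without the surrounding single quotes.
unquoted = "-v 3 -a -m 600 --best --strata"

# With the quotes, shell-style splitting yields ONE argument containing spaces.
as_one = shlex.split(quoted)
# Without them, every flag and value becomes its own argument.
as_many = shlex.split(unquoted)

print(len(as_one), as_one)
print(len(as_many), as_many)
```

In the quoted form the program receives a single seven-word string as one argument, so its option parser never sees `-a`, `-m`, `--best`, or `--strata` as separate flags.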

GoogleCodeExporter commented 9 years ago
Yes, I am asking myself the same questions! I will try the workflow on ALL 
the fgene servers, but it is really weird.

Alex, do I have your email? Just to be able to put you guys in cc if I send 
a technical email. :)

Federica

Original comment by federica...@gmail.com on 7 Sep 2011 at 6:42

GoogleCodeExporter commented 9 years ago
alexgenco@gmail.com 
thanks!

Original comment by alexge...@gmail.com on 7 Sep 2011 at 7:00

GoogleCodeExporter commented 9 years ago
Ah sam, did you write to Bernard about this problem?

Fed

Original comment by federica...@gmail.com on 8 Sep 2011 at 10:07

GoogleCodeExporter commented 9 years ago
Sorry, alex...not sam :)

Original comment by federica...@gmail.com on 8 Sep 2011 at 11:28

GoogleCodeExporter commented 9 years ago
No, I haven't yet; it resolved itself. If it happens again, I'll email him.

Original comment by alexge...@gmail.com on 9 Sep 2011 at 4:27

GoogleCodeExporter commented 9 years ago
Ciao Alex,

So, just some comments about weird things happening, to ask what your 
feeling is about them.

So, let's talk about this module.

I get completely different behavior on the three fgene servers, even though they are 
supposed to work the same way.

For example:

-validation: 
-on fgene1 the input seems not to be there. How is that possible? You were working on 
fgene1, right? So, were you getting this error? If yes, should I assume 
that there is a problem with permissions? 
-fgene2 and fgene3: no problem, everything is green.

-run: on fgene1 it is not possible to run it, as the input files are not visible. On 
fgene2 the run stops at the samtools converter (error stream: ERROR: Unrecognized 
option: 'INPUT), and on fgene3, magic: it is running. I swear that the parameters 
are the same, the work dir and ref dir are the same; I only changed the server. 

So, for me it is important to understand why there is this behavior, because I have 
to use all the servers to also run REAL gigantic data, and I have to explain 
the problem properly to Bernard.

So, do you have the feeling it is a connection/permission/user problem? Maybe I 
have to write to Petros about that.

Thank you so much for your help in advance,

fed

Original comment by federica...@gmail.com on 9 Sep 2011 at 6:09

GoogleCodeExporter commented 9 years ago
Here are the pics.

Original comment by federica...@gmail.com on 9 Sep 2011 at 6:11

GoogleCodeExporter commented 9 years ago
Good news, on fgene3 it worked! The output files are all there, but empty, I 
think because the reads were too few to make any call (these were mini test files). I 
will get back to you on real data, but it ran properly!

Original comment by federica...@gmail.com on 9 Sep 2011 at 6:27

GoogleCodeExporter commented 9 years ago
I reproduced those fgene1 validation errors yesterday exactly (can't connect 
right now). The files are all where they're supposed to be, but it seems like 
our pipeline accounts don't have permission to read them. I even made a simple 
module to run /bin/ls and it failed with a permission denied message. This 
definitely points to a problem with pipeline user accounts.

I will email Bernard right now and cc you. Maybe he's already addressing this.

Original comment by alexge...@gmail.com on 9 Sep 2011 at 6:50
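
The comment above describes files that exist but look absent to the pipeline account. A minimal sketch of how one might tell "missing" apart from "permission denied" for a given path is shown below; the function name `classify_path` is hypothetical, and the result only reflects whichever user runs the script.

```python
import os


def classify_path(path):
    """Return 'missing', 'denied', or 'readable' for the current user."""
    try:
        os.stat(path)
    except (FileNotFoundError, NotADirectoryError):
        return "missing"
    except PermissionError:
        # A parent directory exists but cannot be searched by this user.
        return "denied"
    # The entry exists; check whether this user may actually read it.
    return "readable" if os.access(path, os.R_OK) else "denied"


if __name__ == "__main__":
    # Input file taken from the Bowtie command earlier in this thread.
    reads = ("/projects2/USC/canonical-test-data/hg18_ensembl-bwa/"
             "mini-bam/s_1_1_sequence.150k.txt")
    print(reads, "->", classify_path(reads))
```

Running the same check under a personal account and under the pipeline account would show whether the two see the files differently.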

GoogleCodeExporter commented 9 years ago
Oh, thanks! :)

Original comment by federica...@gmail.com on 9 Sep 2011 at 6:52

GoogleCodeExporter commented 9 years ago
I'm getting new errors in the CNVer module for this. The source seems to be 
hardcoded to look for subdirectories of the --ref_folder that we specify...

Folder 
/projects1/Reference_genomes/hg18_ensembl/gatk-canonical/contig_breaks_folder 
does not exist! at /applications/CNVer/cnver-0.8.1/src/cnver.pl line 111.

What am I supposed to use as the reference folder? Is there some global 
reference folder that contains subdirectories such as:

$ref_folder/contig_breaks_folder
$ref_folder/repeat_regions_folder
$ref_folder/self_alignments_folder
$ref_folder/fasta_files_folder
$ref_folder/autosomes.txt
...etc.

I just somewhat arbitrarily chose the folder that contains the reference file data source, but that seems to be wrong.

Alex

Original comment by alexge...@gmail.com on 13 Sep 2011 at 8:16
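
A short sketch of a pre-flight check for the reference folder, built only from the entry names listed in the comment above. That list ends with "...etc.", so `EXPECTED_ENTRIES` is knowingly incomplete, and the function name `missing_ref_entries` is hypothetical rather than anything provided by CNVer.

```python
import os

# Entries named in the comment above; the original list is open-ended,
# so cnver.pl may look for more than these.
EXPECTED_ENTRIES = [
    "contig_breaks_folder",
    "repeat_regions_folder",
    "self_alignments_folder",
    "fasta_files_folder",
    "autosomes.txt",
]


def missing_ref_entries(ref_folder, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under ref_folder."""
    return [name for name in expected
            if not os.path.exists(os.path.join(ref_folder, name))]


if __name__ == "__main__":
    folder = "/projects1/Reference_genomes/hg18_ensembl/gatk-canonical"
    missing = missing_ref_entries(folder)
    print("looks complete" if not missing else f"missing: {missing}")
```

An empty result does not prove the folder is valid for CNVer; it only rules out the "does not exist" failure for the entries that are known from this thread.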

GoogleCodeExporter commented 9 years ago
No, this reference folder is something that is provided by the developer of 
CNVer, and I have copied it into:

/applications/CNVer/cnver-0.8.1/hg18comp

All the subfolders that are needed are there. For me this module worked 
well.

Fed

Original comment by federica...@gmail.com on 15 Sep 2011 at 5:36

GoogleCodeExporter commented 9 years ago
yep, this works for me now as well.

Original comment by alexge...@gmail.com on 15 Sep 2011 at 6:43

GoogleCodeExporter commented 9 years ago
OK, perfect.

Original comment by federica...@gmail.com on 15 Sep 2011 at 6:55

GoogleCodeExporter commented 9 years ago
Hi Alex,

I figured out the problem with the empty result file for CNVer. If you look at 
the BOWTIE alignment module, it tells you:

# reads processed: 94360373
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 94360373 (100.00%)
No alignments

BUT the problem cannot be BOWTIE itself, because the same bowtie is running in the 
BOWTIE alignment module.

The only guess that comes to mind is that the .ebwt files produced by the build 
step MUST be in the same folder as the original reference .fasta 
file. Looking at the pipe now: where are the .ebwt files going? To a temp folder? 
It would be useful to redirect them to the folder containing the original reference 
genome. This is only a guess. Are you getting the same result (that is, all the 
pipe green BUT all the .cnvs files in the /calls folder empty)?

Fed

Original comment by federica...@gmail.com on 28 Sep 2011 at 10:06
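
Since the pipe can be all green while Bowtie aligns nothing, a small check on the summary lines quoted above could flag the problem early. This is a sketch that assumes the summary text has been captured into a string; the function name `parse_bowtie_summary` is hypothetical.

```python
import re


def parse_bowtie_summary(text):
    """Extract read counts from Bowtie's '# reads ...' summary lines."""
    patterns = {
        "processed": r"# reads processed: (\d+)",
        "aligned": r"# reads with at least one reported alignment: (\d+)",
        "failed": r"# reads that failed to align: (\d+)",
    }
    counts = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            counts[key] = int(match.group(1))
    return counts


# Summary copied from the comment above.
summary = """\
# reads processed: 94360373
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 94360373 (100.00%)
No alignments
"""

counts = parse_bowtie_summary(summary)
if counts.get("processed") and counts.get("aligned") == 0:
    print("Bowtie processed reads but aligned none; check the index and options")
```

A check like this could run right after the alignment module so that an empty SAM file is caught before the CNV stages start.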

GoogleCodeExporter commented 9 years ago
Hello all,

Federica and I have some progress to report on testing the CNVer pipeline. I 
hope I'm replying to the correct ticket, so here goes!

Previously, the pipeline passed validation and executed, but produced an output 
SAM file with 0 mapped reads from bowtie. This obviously precluded the 
subsequent CNV analysis stages, since there were no reads to analyze.

But now we have a few minor changes to report that should fix this, based on 
our latest tests:

1) We updated the version of bowtie-build to v0.12.7 (on the fgenes in: 
/applications/BOWTIE/bowtie-0.12.7). It was previously v0.10.0, but it's no 
longer necessary to use the older version.

2) We *disabled* the '-c' parameter passed to bowtie-build. This option is used 
when you pass a DNA sequence directly through the bowtie-build command line for 
indexing (and conversion into an .ebwt file). 

E.G.: bowtie-build -c 'ATATCGAGATACAGATA'

It's not necessary when passing a file containing sequences to index.

3) We enabled the '-f' option to bowtie-build to tell it the input sequence 
file is in FASTA format. Not sure if this option was enabled originally or not, 
so I thought I'd mention it again.

4) We made a slight change to the EBWT prefix path argument, but this was just 
a logical/organizational change. 

After making these changes, the pipeline still passes validation, takes 
substantially longer on the bowtie-build stage (which is what we'd expect: 
indexing the human reference genome should take about 3 hours, I 
think), and is actually mapping reads to the reference genome, based on the SAM 
file output.  

We are still awaiting results from the complete pipeline run, specifically the 
later portions of the pipeline, (i.e. the actual CNV analysis) because of some 
fgene server glitches and slow bowtie runtimes that delayed us. This is just a 
running update on what we've figured out so far.  

Original comment by apcexcha...@gmail.com on 30 Sep 2011 at 3:00
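
Changes 2 and 3 above can be sketched as an argument list for bowtie-build. With `-c`, bowtie-build reads the sequence itself from the command line, so a file path passed with `-c` would presumably be treated as sequence text rather than opened as a file. The helper name `bowtie_build_argv` and the example file names are hypothetical; only the install directory and the `-f`/`-c` options come from this thread.

```python
def bowtie_build_argv(reference_fasta, ebwt_prefix,
                      bowtie_dir="/applications/BOWTIE/bowtie-0.12.7"):
    """Build an argument list for bowtie-build on a FASTA file (no -c)."""
    return [
        f"{bowtie_dir}/bowtie-build",
        "-f",             # the reference input is a FASTA file
        reference_fasta,  # a file path, so -c must NOT be passed
        ebwt_prefix,      # base name for the generated .ebwt index files
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    argv = bowtie_build_argv("reference.fa", "reference_index")
    print(" ".join(argv))
```

Passing the arguments as a list (for example to `subprocess.run`) also avoids the quoting problem discussed earlier in the thread, because no shell re-splits the options.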

GoogleCodeExporter commented 9 years ago
Sounds good. That -c flag is a good catch; it would definitely throw off the 
results.

I'm running this now.

Alex

Original comment by alexge...@gmail.com on 30 Sep 2011 at 7:47