NCBI-Hackathons / Scan2CNV

MIT License

test scripts/ArrayScan2CNV.py #21

Closed ekarlins closed 7 years ago

ekarlins commented 7 years ago

"scripts/ArrayScan2CNV.py" is the wrapper for the whole pipeline. Not all features are in the pipeline yet, but you can still test the wrapper script. I haven't really tested that it works. I based it on a script I had written just for our NCI cluster, so I know there will be a lot to generalize to get it working on other systems. @mtbrown22 and @slsevilla, try it on the NCI cluster and see if you can get it running. There are several Python versions on the cluster ("module avail python" shows them). You can try it with all of them if you want.

@ngiangre, this definitely won't work for you, but you can help figure out what needs to be changed to get it working.

Please open separate issues for anything you see in this script that is not generalized for other users. Some of these you can see just by looking at the help page ("./ArrayScan2CNV.py -h"). For others you may need to look at the code. Also, note in the README any dependencies that you see.

slsevilla commented 7 years ago

Getting the following error:

[sevillas2@cgemsIII scripts]$ ./ArrayScan2CNV.py -n/--Trial1 -g/--/CGF/Infinium/ScanData/CGF/ByProject/GP0446/IN1/GTC_Beeline_GP0446-IN1n2_CR97_n838 -d/--/CGF/TempFileSwap/Sam/Hackathon2017 -b/--/CGF/TempFileSwap/Sam/Hackathon2017/GSAMD-24v1-0_20011747_A1.bpm
Traceback (most recent call last):
  File "./ArrayScan2CNV.py", line 115, in <module>
    main()
  File "./ArrayScan2CNV.py", line 90, in main
    paths = os.listdir(outDir)
OSError: [Errno 2] No such file or directory: '/--/CGF/TempFileSwap/Sam/Hackathon2017'
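The path in the error starts with "/--", which suggests the "-d/--" from the help text was typed literally, so the option value became "/--/CGF/...". Separately, the wrapper could fail with a clearer message (or create the directory) before calling os.listdir. A minimal sketch, assuming a hypothetical helper named list_output_dir that is not part of ArrayScan2CNV.py:

```python
import os
import tempfile


def list_output_dir(out_dir, create=False):
    """Return the entries of out_dir, with a clear error if it is missing.

    Hypothetical helper for illustration; not taken from ArrayScan2CNV.py.
    """
    out_dir = os.path.abspath(out_dir)
    if not os.path.isdir(out_dir):
        if not create:
            # Exit with a readable message instead of a raw OSError traceback.
            raise SystemExit("Output directory does not exist: " + out_dir)
        os.makedirs(out_dir)
    return os.listdir(out_dir)


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real cluster path.
    base = tempfile.mkdtemp()
    print(list_output_dir(os.path.join(base, "Output"), create=True))
```

Whether the wrapper should create the directory or refuse to run is a design choice; refusing is safer when a typo in the path is the likely cause.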

mtbrown22 commented 7 years ago

[brownmt2@cgemsIII scripts]$ python ArrayScan2CNV.py -n test1 -g /DCEG/CGF/TempFileSwap/Maria/Hackathon/Global_Screening_Arrays/files/ -d /DCEG/CGF/TempFileSwap/Maria/Hackathon/output/ -b /DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm
Traceback (most recent call last):
  File "ArrayScan2CNV.py", line 115, in <module>
    main()
  File "ArrayScan2CNV.py", line 105, in main
    if args.unlock_snakemake:
AttributeError: 'Namespace' object has no attribute 'unlock_snakemake'
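This AttributeError usually means the code reads an option that was never registered with the parser (or was registered under a different dest). A minimal sketch of how argparse defines the attribute; the option names here are assumptions for illustration and may not match the real script:

```python
import argparse


def build_parser():
    """Hypothetical parser; the real ArrayScan2CNV.py options may differ."""
    parser = argparse.ArgumentParser(description="Illustrative wrapper options")
    parser.add_argument("-n", "--name", required=True)
    # store_true always defines args.unlock_snakemake (False when the flag is absent).
    parser.add_argument("-u", "--unlock_snakemake", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-n", "test1"])
    print(args.unlock_snakemake)
    # getattr with a default guards code that reads an option which may not be registered.
    print(getattr(args, "not_registered", False))
```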

mtbrown22 commented 7 years ago

Please check gtc file in files section of this repo. @ekarlins

slsevilla commented 7 years ago

"logs" folder created
config.yaml created
Snakefile created

mtbrown22 commented 7 years ago

[brownmt2@cgemsIII scripts]$ python ArrayScan2CNV.py -n test2 -g /DCEG/CGF/TempFileSwap/Maria/Hackathon/Global_Screening_Arrays/files/ -d /DCEG/CGF/TempFileSwap/Maria/Hackathon/output/ -b /DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm
Traceback (most recent call last):
  File "ArrayScan2CNV.py", line 116, in <module>
    main()
  File "ArrayScan2CNV.py", line 111, in main
    runQsub(outDir + '/GwasQcPipeline.sh', os.path.basename(outDir), 'seq-alignment.q,seq-calling.q,seq-calling2.q,seq-gvcf.q,research.q')
  File "ArrayScan2CNV.py", line 32, in runQsub
    retcode = subprocess.call(qsubCall)
  File "/DCEG/Resources/Tools/python/2.7.4-shared/lib/python2.7/subprocess.py", line 524, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/DCEG/Resources/Tools/python/2.7.4-shared/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/DCEG/Resources/Tools/python/2.7.4-shared/lib/python2.7/subprocess.py", line 1308, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
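An OSError with Errno 2 raised from subprocess.call generally means the program itself could not be found. Two possible causes (neither is confirmed in this thread): the executable is not on PATH, or the whole command was passed as a single string without shell=True, so the full string was treated as the program name. A sketch of a pre-flight check, assuming Python 3.3+ for shutil.which (the cluster runs above use 2.7) and a hypothetical run_qsub that may not match the script's runQsub:

```python
import os
import shutil
import subprocess


def check_submission(executable, script_path):
    """Return a list of problems that would stop the job from being submitted."""
    problems = []
    if shutil.which(executable) is None:
        problems.append("executable not found on PATH: " + executable)
    if not os.path.isfile(script_path):
        problems.append("job script not found: " + script_path)
    return problems


def run_qsub(script_path, job_name, queues, executable="qsub"):
    """Hypothetical stand-in for runQsub; the real argument list is an assumption."""
    problems = check_submission(executable, script_path)
    if problems:
        raise SystemExit("; ".join(problems))
    # Pass the command as a list so each argument stays separate (no shell needed).
    # -N (job name) and -q (queue list) are Grid Engine qsub options.
    return subprocess.call([executable, "-N", job_name, "-q", queues, script_path])
```

Calling check_submission first turns the bare "No such file or directory" into a message that names what is missing.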

slsevilla commented 7 years ago

New Error after changes

[sevillas2@cgemsIII Hackathon2017]$ GitRepo/Global_Screening_Arrays/scripts/ArrayScan2CNV.py -n Trial1 -g /CGF/Infinium/ScanData/CGF/ByProject/GP0446/IN1/GTC_Beeline_GP0446-IN1n2_CR97_n838 -d /CGF/TempFileSwap/Sam/Hackathon2017/Output -b /CGF/TempFileSwap/Sam/Hackathon2017/GSAMD-24v1-0_20011747_A1.bpm
Traceback (most recent call last):
  File "GitRepo/Global_Screening_Arrays/scripts/ArrayScan2CNV.py", line 116, in <module>
    main()
  File "GitRepo/Global_Screening_Arrays/scripts/ArrayScan2CNV.py", line 111, in main
    runQsub(outDir + '/GwasQcPipeline.sh', os.path.basename(outDir), 'seq-alignment.q,seq-calling.q,seq-calling2.q,seq-gvcf.q,research.q')
  File "GitRepo/Global_Screening_Arrays/scripts/ArrayScan2CNV.py", line 32, in runQsub
    retcode = subprocess.call(qsubCall)
  File "/DCEG/Resources/Tools/python/2.7.8-shared/lib/python2.7/subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/DCEG/Resources/Tools/python/2.7.8-shared/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/DCEG/Resources/Tools/python/2.7.8-shared/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

"logs" folder created
config.yaml created
Snakefile created
GwasQCPipeline created

mtbrown22 commented 7 years ago

Job submitted.

[brownmt2@cgemsIII scripts]$ python Scan2CNV.py -n test2 -g /DCEG/CGF/TempFileSwap/Maria/Hackathon/Global_Screening_Arrays/files/ -d /DCEG/CGF/TempFileSwap/Maria/Hackathon/output/ -b /DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm -m -u
Your job 6028273 ("Scan2CNV.") has been submitted
ArrayScan2CNV Pipeline submitted. You should receive an email when the pipeline starts and when it completes.

Completed email:

Job 6028273 (Scan2CNV.) Complete
User = brownmt2
Queue = all.q@node017.cm.cluster
Host = node017.cm.cluster
Start Time = 03/22/2017 11:38:53
End Time = 03/22/2017 11:38:57
User Time = 00:00:01
System Time = 00:00:00
Wallclock Time = 00:00:04
CPU = 00:00:01
Max vmem = 366.277M
Exit Status = 1

There are no files in the "logs" folder.

[brownmt2@cgemsIII output]$ tail Scan2CNV.stderr
KeyError in line 21 of /mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/output/Snakefile:
'project_name'
  File "/mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/output/Snakefile", line 21, in <module>
KeyError in line 21 of /mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/output/Snakefile:
'project_name'
  File "/mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/output/Snakefile", line 21, in <module>
KeyError in line 21 of /mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/output/Snakefile:
'project_name'
  File "/mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/output/Snakefile", line 21, in <module>

ekarlins commented 7 years ago

@mtbrown22, is there 'project_name' in the config.yaml file?

mtbrown22 commented 7 years ago

No, there isn't a 'project_name'.

mtbrown22 commented 7 years ago

config.yaml file

gtc_dir: /DCEG/CGF/TempFileSwap/Maria/Hackathon/Global_Screening_Arrays/files/
output_dir: /DCEG/CGF/TempFileSwap/Maria/Hackathon/output/
bpm: /DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm
repo_scripts: /mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/Global_Screening_Arrays/scripts
start_time: Wed Mar 22 11:38:38 2017

@ekarlins

ekarlins commented 7 years ago

Thanks @mtbrown22! The Snakefile is supposed to get the 'project_name' variable from the config file. I've updated the wrapper so that config.yaml includes it. Try running it now.
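To catch this kind of mismatch before the job is submitted, the wrapper could check the config values against the keys the Snakefile reads. A minimal sketch; the REQUIRED_KEYS tuple is an assumption based on the config.yaml posted above plus 'project_name', and the real Snakefile may read other keys:

```python
# Assumed list of keys; adjust to whatever the Snakefile actually looks up.
REQUIRED_KEYS = ("project_name", "gtc_dir", "output_dir", "bpm")


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent from the config mapping."""
    return [key for key in required if key not in config]


if __name__ == "__main__":
    # Values copied from the config.yaml posted in this thread.
    config = {
        "gtc_dir": "/DCEG/CGF/TempFileSwap/Maria/Hackathon/Global_Screening_Arrays/files/",
        "output_dir": "/DCEG/CGF/TempFileSwap/Maria/Hackathon/output/",
        "bpm": "/DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm",
    }
    missing = missing_config_keys(config)
    if missing:
        print("config.yaml is missing: " + ", ".join(missing))
```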

mtbrown22 commented 7 years ago

It works! I ran with more than 1 file this time.

[brownmt2@cgemsIII scripts]$ python Scan2CNV.py -n test2 -g /DCEG/CGF/TempFileSwap/Maria/Hackathon/files/ -d /DCEG/CGF/TempFileSwap/Maria/Hackathon/output/ -b /DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm -m

Job 6028694 (Scan2CNV.) Complete
User = brownmt2
Queue = all.q@node020.cm.cluster
Host = node020.cm.cluster
Start Time = 03/22/2017 12:45:36
End Time = 03/22/2017 12:50:19
User Time = 00:00:01
System Time = 00:00:00
Wallclock Time = 00:04:43
CPU = 00:00:02
Max vmem = 777.020M
Exit Status = 0

@ekarlins

mtbrown22 commented 7 years ago

Ran the following with the created pfb file:

[brownmt2@cgemsIII scripts]$ python Scan2CNV.py -n test2cnv -g /DCEG/CGF/TempFileSwap/Maria/Hackathon/files/ -b /DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm -p /DCEG/CGF/TempFileSwap/Maria/Hackathon/output/PFB/test2.pfb -d /DCEG/CGF/TempFileSwap/Maria/Hackathon/pfb_output/
Your job 6028732 ("Scan2CNV.") has been submitted
Scan2CNV Pipeline submitted. You should receive an email when the pipeline starts and when it completes.

Received this error:

SyntaxError in line 61 of /mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/pfb_output/Snakefile:
EOF in multi-line statement (Snakefile, line 61)
SyntaxError in line 61 of /mnt/nfs/gigantor/ifs/DCEG/CGF/TempFileSwap/Maria/Hackathon/pfb_output/Snakefile:
EOF in multi-line statement (Snakefile, line 61)

@ekarlins

ekarlins commented 7 years ago

Run "git pull" and try again. It needs an hmm argument now too.
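If the HMM file is only needed when a PFB file is supplied, the wrapper could enforce that at parse time instead of failing later in the Snakefile. A sketch under that assumption; the option names mirror the commands in this thread, but the dependency rule itself is a guess, not confirmed behavior of Scan2CNV.py:

```python
import argparse


def parse_cnv_args(argv):
    """Hypothetical parser showing one option that requires another."""
    parser = argparse.ArgumentParser(description="Illustrative CNV-calling options")
    parser.add_argument("-p", "--pfb", help="PFB file from an earlier run")
    parser.add_argument("-hmm", "--hmm", dest="hmm", help="HMM file for CNV calling")
    args = parser.parse_args(argv)
    if args.pfb and not args.hmm:
        # parser.error prints the usage line and exits with status 2.
        parser.error("-p/--pfb was given, so -hmm is also required")
    return args


if __name__ == "__main__":
    args = parse_cnv_args(["-p", "test2.pfb", "-hmm", "example.hmm"])
    print(args.pfb, args.hmm)
```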


mtbrown22 commented 7 years ago

Job submitted successfully:

python Scan2CNV.py -n test2cnv -g /DCEG/CGF/TempFileSwap/Maria/Hackathon/files/ -b /DCEG/CGF/Infinium/Resources/Manifests/GSAMD-24v1-0_20011747_A1.bpm -p /DCEG/CGF/TempFileSwap/Maria/Hackathon/output/PFB/test2.pfb -hmm /DCEG/Resources/Tools/PennCNV/PennCNV-1.0.3/example/example.hmm -d /DCEG/CGF/TempFileSwap/Maria/Hackathon/pfb_output/

Your job 6028798 ("Scan2CNV.") has been submitted Scan2CNV Pipeline submitted. You should receive an email when the pipeline starts and when it completes.

It didn't complete successfully. Output: /DCEG/CGF/TempFileSwap/Maria/Hackathon/pfb_output

@ekarlins

ekarlins commented 7 years ago

This script is running now. It will remain a work in progress as we add more steps, but I'm closing this issue since it's not a priority for others to test this right now. Feel free to run the script, though.