taylor-lab / hotspots

Identifying recurrent mutations in cancer
http://www.ncbi.nlm.nih.gov/pubmed/26619011
GNU Affero General Public License v3.0

Error in creating ___tmp.tsv #11

Closed by caloto 6 years ago

caloto commented 6 years ago

https://github.com/taylor-lab/hotspots/blob/733c727bd4b9f7c1a7f4508b9a467b2f31cacf33/make_trinuc_maf.py#L48

When I run the code, it fails at this line. There is no ___tmp.tsv file in the directory, and no such file was ever created. Can you please help me?

Thank you!

caloto commented 6 years ago

It gives me:

... Making bed file
... Getting regions
Traceback (most recent call last):
  File "make_trinuc_maf.py", line 48, in <module>
    subprocess.call("bedtools getfasta -tab -fi /ifs/depot/assemblies/H.sapiens/GRCh37/gr37.fasta -bed tmp.bed -fo tmp.tsv".split(" "))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  no lines available in input
Calls: read.csv -> read.table
Execution halted

kpjonsson commented 6 years ago

@caloto As you can see in the error message, the call to bedtools includes a hardcoded path to a fasta file. Unless you're working on the luna cluster within MSK, you won't have access to that file.

caloto commented 6 years ago

@kpjonsson Thank you so much for your answer! Is there any way to fix it without being on the luna cluster?

kpjonsson commented 6 years ago

You can download the assembly here: https://grch37.ensembl.org/Homo_sapiens/Info/Index

Note, only use GRCh37 if that's the genome version used in your MAF file.
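One way to avoid editing the hardcoded path by hand is to pass the reference FASTA in as a parameter. The following is only a sketch under stated assumptions, not the repository's code: it is written for Python 3, the helper names `resolve_fasta` and `build_getfasta_cmd` and the environment variable `HOTSPOTS_REF_FASTA` are hypothetical, and the bedtools flags and file names are the ones that appear in this thread.

```python
import os

# Hypothetical default: the file name of the Ensembl download mentioned in this thread.
DEFAULT_FASTA = "Homo_sapiens.GRCh37.dna.primary_assembly.fa"


def resolve_fasta(env):
    """Pick the reference FASTA from a dict-like environment, else use the default.

    HOTSPOTS_REF_FASTA is an assumed variable name, not one the repository defines.
    """
    return env.get("HOTSPOTS_REF_FASTA", DEFAULT_FASTA)


def build_getfasta_cmd(fasta_path, bed_path="tmp.bed", out_path="tmp.tsv"):
    """Build the bedtools getfasta argument list used in this thread.

    Returning a list (instead of splitting a string on spaces) keeps paths
    that contain spaces intact.
    """
    return [
        "bedtools", "getfasta", "-tab",
        "-fi", fasta_path,
        "-bed", bed_path,
        "-fo", out_path,
    ]


fasta = resolve_fasta(os.environ)
print(" ".join(build_getfasta_cmd(fasta)))
```

The resulting list can be handed to `subprocess.call` directly. As noted above, the FASTA should match the genome version used in the MAF file.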

caloto commented 6 years ago

Ok, very helpful! Thank you!

caloto commented 6 years ago

Thank you for your time.

I have followed your recommendations (thank you so much) and downloaded the GRCh37 human genome assembly as "Homo_sapiens.GRCh37.dna.primary_assembly.fa". Then I tried to point the script at this file by changing only the "subprocess.call" line, as follows:

# Fetch regions
print " ... Getting regions"
subprocess.call("bedtools getfasta -tab -fi Homo_sapiens.GRCh37.dna.primary_assembly.fa -bed tmp.bed -fo tmp.tsv".split(" "))

I have also tried:

subprocess.call("bedtools getfasta -tab -fi ./Homo_sapiens.GRCh37.dna.primary_assembly.fa -bed tmp.bed -fo tmp.tsv".split(" "))

and:

subprocess.call("bedtools getfasta -tab -fi /Users/lab02/Desktop/hotspots-master/Homo_sapiens.GRCh37.dna.primary_assembly.fa -bed tmp.bed -fo tmp.tsv".split(" "))

and:

subprocess.call("bedtools getfasta -tab -fi ./Users/lab02/Desktop/hotspots-master/Homo_sapiens.GRCh37.dna.primary_assembly.fa -bed tmp.bed -fo tmp.tsv".split(" "))

All of the previous changes resulted in the same error:

... Making bed file
... Getting regions
Traceback (most recent call last):
  File "make_trinuc_maf.py", line 48, in <module>
    subprocess.call("bedtools getfasta -tab -fi /ifs/depot/assemblies/H.sapiens/GRCh37/gr37.fasta -bed tmp.bed -fo tmp.tsv".split(" "))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  no lines available in input
Calls: read.csv -> read.table
Execution halted

Can you please help me with this issue?

Thank you very much for your time and attention!

kpjonsson commented 6 years ago

Judging by that error message the script is still trying to find the fasta file located at the hardcoded path on our cluster.

caloto commented 6 years ago

That is what I think too, but I have not been able to fix it.

caloto commented 6 years ago

Do you have any idea why the program is still looking in your directory?

kpjonsson commented 6 years ago

How exactly are you running it?

caloto commented 6 years ago

I have all the files in the same folder. In the terminal, from inside that folder, I paste:

./hotspot_algo.R \
    --input-maf=pancancer_unfiltered.maf \
    --rdata=hotspot_algo.Rdata \
    --gene-query=genes_of_interest.txt \
    --output-file=sig_hotspots.txt

and press enter. Of course, before doing this I change the original line in make_trinuc_maf.py:

subprocess.call("bedtools getfasta -tab -fi /ifs/depot/assemblies/H.sapiens/GRCh37/gr37.fasta -bed tmp.bed -fo tmp.tsv".split(" "))

to one of the following:

  1. subprocess.call("bedtools getfasta -tab -fi ./Homo_sapiens.GRCh37.dna.primary_assembly.fa -bed tmp.bed -fo tmp.tsv".split(" "))

  2. subprocess.call("bedtools getfasta -tab -fi /Users/lab02/Desktop/hotspots-master/Homo_sapiens.GRCh37.dna.primary_assembly.fa -bed tmp.bed -fo tmp.tsv".split(" "))

  3. subprocess.call("bedtools getfasta -tab -fi ./Users/lab02/Desktop/hotspots-master/Homo_sapiens.GRCh37.dna.primary_assembly.fa -bed tmp.bed -fo tmp.tsv".split(" "))

That's all. Thank you again for your attention to this issue.

kpjonsson commented 6 years ago

I'm not sure what's wrong. I don't see how you can get that error message if you have modified that line of code.
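One thing that may be worth checking here: when `OSError: [Errno 2]` is raised from inside `Popen` (as in the `_execute_child` frame of the traceback above), it generally means the program being launched could not be found, not that one of its file arguments is missing. A missing input file would normally make the program itself exit with an error code instead. The sketch below illustrates the difference. It assumes Python 3, the helper name `launch_error` and the program name `no-such-program-hotspots-demo` are hypothetical, and the current Python interpreter stands in for a real tool.

```python
import errno
import subprocess
import sys


def launch_error(cmd):
    """Return the errno raised while trying to start cmd, or None if it started."""
    try:
        subprocess.call(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except OSError as exc:
        return exc.errno
    return None


# Case 1: the executable does not exist, so launching fails with ENOENT (errno 2),
# whatever the arguments are.
missing_program = launch_error(["no-such-program-hotspots-demo", "-fi", "any.fa"])
print("missing program raises ENOENT:", missing_program == errno.ENOENT)

# Case 2: the executable exists but its input file does not. The program starts,
# reports its own error, and no OSError is raised by subprocess.
missing_input = launch_error([sys.executable, "no_such_script_file.py"])
print("missing input file raises nothing:", missing_input is None)
```

If the first case matches what you see, the thing to investigate is whether `bedtools` itself can be started from the environment the script runs in.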

caloto commented 6 years ago

That's right, it is so mysterious... Could it be something inside the subprocess?

caloto commented 6 years ago

The error has been solved. If someone else runs into this, the answer is to reinstall the bedtools package from Homebrew instead of from any other source.
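For anyone hitting the same problem, a quick check before running the pipeline is to confirm that `bedtools` can be found on the `PATH`. This is a minimal sketch, assuming Python 3 (`shutil.which` is not available in Python 2.7); the helper name `describe_tool` is hypothetical and not part of the repository.

```python
import shutil


def describe_tool(name):
    """Report where an executable was found on PATH, or that it was not found."""
    path = shutil.which(name)
    if path is None:
        return "%s: not found on PATH (try reinstalling it, for example with Homebrew)" % name
    return "%s: found at %s" % (name, path)


print(describe_tool("bedtools"))
```

Running this in the same terminal session used to launch hotspot_algo.R shows whether the script would be able to start bedtools at all.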

Thank you very much for your help and attention!!

kpjonsson commented 6 years ago

Great.