EBI-COMMUNITY / ebi-parasite

GNU General Public License v3.0

download_and_index_genome_files.py: Error: name 'picard' is not defined #5

Open MostafaYA opened 4 years ago

MostafaYA commented 4 years ago

Hi, thanks for this great work. I tried to reproduce the results by following the README instructions, but I got this error: NameError: name 'picard' is not defined

Here is what I did (after adjusting the paths in the file "properties.txt"):

    ./download_and_index_genome_files.py -p ~/ebi-parasite/properties.txt -g cryptosporidium_hominis

This is the output:

genome_name: cryptosporidium_hominis

 Properties attributes:
{'propertyf': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/properties.txt', 'available_genomes': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/genome_list.txt', 'workdir': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone', 'snpeff_dir': '/home/software/snpEff/', 'snpeff_config': '/home/software/snpEff/snpEff.config', 'r_script': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/create_all_images.R', 'r_snp_dist_on_chr': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/SNP_dist_on_chrom.R'}
initiating...
executing...
cmd=curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
version=46
curl https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//fasta/data/CryptoDB-46_ChominisUdeA01_Genome.fasta -o /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
cmd=curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
version=46
curl https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/CryptoDB-46_ChominisUdeA01.gff -o /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.gff3
samtools faidx /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
bwa index /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
bowtie2-build /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta cryptosporidium_hominis
Traceback (most recent call last):
  File "./download_and_index_genome_files.py", line 78, in <module>
    execute()
  File "./download_and_index_genome_files.py", line 67, in execute
    dl_index_genome_files()
  File "./download_and_index_genome_files.py", line 55, in dl_index_genome_files
    .format(picard,REF_FASTA,re.findall("(.*?).fa",REF_FASTA)[0]+".dict")).run_comm(0)
NameError: name 'picard' is not defined

I have picard installed and run it as java -jar /home/software/picard/picard.jar. Therefore I tried replacing picard in the script with "java -jar /home/software/picard/picard.jar". I don't know if that was a good idea, but I ended up with a different error:

properties_file: /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/properties.txt
genome_name: cryptosporidium_hominis

 Properties attributes:
{'propertyf': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/properties.txt', 'available_genomes': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/genome_list.txt', 'workdir': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone', 'snpeff_dir': '/home/software/snpEff/', 'snpeff_config': '/home/software/snpEff/snpEff.config', 'r_script': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/create_all_images.R', 'r_snp_dist_on_chr': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/SNP_dist_on_chrom.R'}
initiating...
executing...
cmd=curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
version=46
curl https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//fasta/data/CryptoDB-46_ChominisUdeA01_Genome.fasta -o /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
cmd=curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
version=46
curl https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/CryptoDB-46_ChominisUdeA01.gff -o /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.gff3
samtools faidx /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
bwa index /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
bowtie2-build /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta cryptosporidium_hominis
java -jar /home/software/picard/picard.jar CreateSequenceDictionary REFERENCE=/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta OUTPUT=/home/most.dict

ERROR: java -jar /home/software/picard/picard.jar CreateSequenceDictionary REFERENCE=/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta OUTPUT=/home/most.dict FAILED!!!

STDERR: b'INFO\t2020-04-16 11:21:20\tCreateSequenceDictionary\t\n\n********** NOTE: Picard\'s command line syntax is changing.\n**********\n********** For more information, please see:\n********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)\n**********\n********** The command line looks like this in the new syntax:\n**********\n**********    CreateSequenceDictionary -REFERENCE /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta -OUTPUT /home/most.dict\n**********\n\n\n11:21:21.521 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/software/picard/picard.jar!/com/intel/gkl/native/libgkl_compression.so\n[Thu Apr 16 11:21:21 CEST 2020] CreateSequenceDictionary OUTPUT=/home/most.dict REFERENCE=/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta    TRUNCATE_NAMES_AT_WHITESPACE=true NUM_SEQUENCES=2147483647 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false\n[Thu Apr 16 11:21:21 CEST 2020] Executing as mostafa.abdel@je-seq160020.intern.fli.bund.local on Linux 3.10.0-1062.18.1.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_242-b08; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.18.29-SNAPSHOT\n[Thu Apr 16 11:21:21 CEST 2020] picard.sam.CreateSequenceDictionary done. 
Elapsed time: 0.00 minutes.\nRuntime.totalMemory()=2058354688\nTo get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp\nException in thread "main" picard.PicardException: File /home/most.dict not found\n\tat picard.sam.CreateSequenceDictionary.doWork(CreateSequenceDictionary.java:237)\n\tat picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:295)\n\tat picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)\n\tat picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)\n'
STDOUT: b''
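
A note on the OUTPUT=/home/most.dict path in the log above: it appears to come from the expression shown in the traceback, re.findall("(.*?).fa",REF_FASTA)[0], rather than from Picard itself. The dot is unescaped and the group is non-greedy, so the match stops at the first place where any character is followed by "fa", which here falls inside the user name "mostafa". Picard then fails, presumably because it cannot create a file directly under /home. The sketch below reproduces this with the path from the log. Using os.path.splitext as a replacement is a suggestion, not the repository's code:

```python
import os
import re

# Reference path copied from the log output above.
REF_FASTA = ("/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/"
             "working_dir/simone/ref/cryptosporidium_hominis.fasta")

# Expression from the traceback: "." matches any character and "(.*?)" is
# non-greedy, so the first hit is the "afa" inside "mostafa".
buggy_dict = re.findall("(.*?).fa", REF_FASTA)[0] + ".dict"

# Suggested alternative (an assumption, not the repository's code):
# strip only the final extension of the file name.
fixed_dict = os.path.splitext(REF_FASTA)[0] + ".dict"

print(buggy_dict)
print(fixed_dict)
```

The first value is the truncated path seen in the log. The second keeps the full directory and only swaps .fasta for .dict.
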
MoggoB commented 4 years ago

Hello, I ran into the same error, so I changed line 54 like this:

    if not os.path.isfile(ref_dict_fpath):
        command("picard CreateSequenceDictionary REFERENCE={} OUTPUT {}".format(REF_FASTA,re.findall("(.*?).fa",REF_FASTA)[0]+".dict")).run_comm(0)   

I don't get any error messages now. I installed picard using conda. I hope this can help you.
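
For context, the original NameError only means that the script uses a variable named picard that is never assigned. One way to define it so that both a conda install (a picard wrapper on PATH) and a standalone jar work is sketched below. The function name resolve_picard_cmd and the PICARD_JAR environment variable are hypothetical and not part of ebi-parasite:

```python
import os
import shutil


def resolve_picard_cmd(which=shutil.which, environ=os.environ):
    """Return a command prefix for Picard, or raise if none is found.

    PICARD_JAR is a hypothetical environment variable used for this sketch.
    """
    # A conda install puts a "picard" wrapper script on PATH.
    if which("picard"):
        return "picard"
    # Otherwise fall back to an explicitly configured jar file.
    jar = environ.get("PICARD_JAR")
    if jar:
        return "java -jar {}".format(jar)
    raise RuntimeError("Picard not found: install it or set PICARD_JAR")
```

With something like picard = resolve_picard_cmd() near the top of the script, the original .format(picard, ...) call would have a value to work with. The which and environ parameters exist only so the lookup can be exercised without touching the real environment.
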

MostafaYA commented 4 years ago

Hi, thanks for your answer. I installed picard using conda and changed line 54 as indicated, but I am still getting this error (the second error above):

ERROR: picard CreateSequenceDictionary REFERENCE=/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta OUTPUT=/home/most.dict FAILED!!! 

I suspect that picard wants to write the output file most.dict directly into /home/, which I am not allowed to do because I lack write privileges there.

This is the full error


properties_file: /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/properties.txt
genome_name: cryptosporidium_hominis

 Properties attributes:
{'propertyf': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/properties.txt', 'available_genomes': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/genome_list.txt', 'workdir': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone', 'snpeff_dir': '/home/software/snpEff/', 'snpeff_config': '/home/software/snpEff/snpEff.config', 'r_script': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/create_all_images.R', 'r_snp_dist_on_chr': '/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/SNP_dist_on_chrom.R'}
initiating...
executing...
cmd=curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
version=46
curl https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//fasta/data/CryptoDB-46_ChominisUdeA01_Genome.fasta -o /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
cmd=curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
curl -s https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/ | grep buildNumber | awk '{split($0,a,"\"");print a[2]}'
version=46
curl https://cryptodb.org/common/downloads/Current_Release/ChominisUdeA01//gff/data/CryptoDB-46_ChominisUdeA01.gff -o /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.gff3
samtools faidx /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
bwa index /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta
bowtie2-build /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta cryptosporidium_hominis
picard CreateSequenceDictionary REFERENCE=/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta OUTPUT /home/most.dict

ERROR: picard CreateSequenceDictionary REFERENCE=/home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta OUTPUT /home/most.dict FAILED!!!

STDERR: b'INFO\t2020-04-21 13:07:01\tCreateSequenceDictionary\t\n\n********** NOTE: Picard\'s command line syntax is changing.\n**********\n********** For more information, please see:\n********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)\n**********\n********** The command line looks like this in the new syntax:\n**********\n**********    CreateSequenceDictionary -REFERENCE /home/mostafa.abdel/aProjects/cryptosporidium/ebi-parasite/working_dir/simone/ref/cryptosporidium_hominis.fasta OUTPUT /home/most.dict\n**********\n\n\nERROR: Invalid argument \'OUTPUT\'.\n\nUSAGE: CreateSequenceDictionary [options]\n\nDocumentation: http://broadinstitute.github.io/picard/command-line-overview.html#CreateSequenceDictionary\n\nCreates a sequence dictionary for a reference sequence.  This tool creates a sequence dictionary file (with ".dict"\nextension) from a reference sequence provided in FASTA format, which is required by many processing and analysis tools.\nThe output file contains a header but no SAMRecords, and the header contains only sequence records.\n\nThe reference sequence can be gzipped (both .fasta and .fasta.gz are supported).\nUsage example:\n\njava -jar picard.jar CreateSequenceDictionary \\ \nR=reference.fasta \\ \nO=reference.dict\n\n\nVersion: 2.22.3\n\n\nOptions:\n\n--help\n-h                            Displays options specific to this tool.\n\n--stdhelp\n-H                            Displays options specific to this tool AND options common to all Picard command line\n                              tools.\n\n--version                     Displays program version.\n\nOUTPUT=File\nO=File                        Output SAM file containing only the sequence dictionary. By default it will use the base\n                              name of the input reference with the .dict extension  Default value: null. 
\n\nGENOME_ASSEMBLY=String\nAS=String                     Put into AS field of sequence dictionary entry if supplied  Default value: null. \n\nURI=String\nUR=String                     Put into UR field of sequence dictionary entry.  If not supplied, input reference file is\n                              used  Default value: null. \n\nSPECIES=String\nSP=String                     Put into SP field of sequence dictionary entry  Default value: null. \n\nTRUNCATE_NAMES_AT_WHITESPACE=Boolean\n                              Make sequence name the first word from the > line in the fasta file.  By default the\n                              entire contents of the > line is used, excluding leading and trailing whitespace.  Default\n                              value: true. This option can be set to \'null\' to clear the default value. Possible values:\n                              {true, false} \n\nNUM_SEQUENCES=Integer         Stop after writing this many sequences.  For testing.  Default value: 2147483647. This\n                              option can be set to \'null\' to clear the default value. \n\nALT_NAMES=File\nAN=File                       Optional file containing the alternative names for the contigs. Tools may use this\n                              information to consider different contig notations as identical (e.g: \'chr1\' and \'1\'). The\n                              alternative names will be put into the appropriate @AN annotation for each contig. No\n                              header. First column is the original name, the second column is an alternative name. One\n                              contig may have more than one alternative name.  Default value: null. \n\nREFERENCE=File\nR=File                        Input reference fasta or fasta.gz  Required. \n\n'
STDOUT: b''
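
The usage text in this second log points at a separate problem: the edited line passes OUTPUT and the path as two tokens ("OUTPUT /home/most.dict"), while this Picard syntax expects KEY=VALUE pairs, hence "Invalid argument 'OUTPUT'". Below is a sketch of a command builder that writes OUTPUT=... and derives the .dict path with os.path.splitext instead of the regular expression. build_dict_cmd is a hypothetical helper, the /data/ref path is a placeholder, and the repository's command(...).run_comm(0) wrapper is not reproduced here:

```python
import os


def build_dict_cmd(picard_cmd, ref_fasta):
    """Build a CreateSequenceDictionary command in KEY=VALUE syntax."""
    # Put the .dict file next to the reference, replacing only the extension.
    dict_path = os.path.splitext(ref_fasta)[0] + ".dict"
    return "{} CreateSequenceDictionary REFERENCE={} OUTPUT={}".format(
        picard_cmd, ref_fasta, dict_path)


# Placeholder path for illustration only.
cmd = build_dict_cmd("picard", "/data/ref/cryptosporidium_hominis.fasta")
print(cmd)
```

According to the help text in the log, OUTPUT defaults to the base name of the input reference with the .dict extension, so leaving OUTPUT out entirely may also work, provided the rest of the script looks for the file in that location.
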