wandreopoulos / deeplasmid


Error when running deeplasmid cannot find yml file #6

Open michhulin opened 1 year ago

michhulin commented 1 year ago

Hi,

I am trying to run deeplasmid on the HPC at my workplace. The computer support team have installed the software. We have been trying it out and came across this error where it cannot find the file "yml" in the output folder. I was wondering if you could advise please?

Many thanks Michelle

TOTAL_RUNTIME: 310.58503294
exiting........
2023-02-03 10:18:19.956907: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
oneHot base, size= 15 , sample:
base: A 1-hot: [1.0, 0.0, 0.0, 0.0]
base: C 1-hot: [0.0, 1.0, 0.0, 0.0]
base: T 1-hot: [0.0, 0.0, 1.0, 0.0]
base: G 1-hot: [0.0, 0.0, 0.0, 1.0]
all bases : ['A', 'C', 'T', 'G', 'N', 'Y', 'K', 'M', 'V', 'S', 'H', 'R', 'W', 'B', 'D']
use seqLenCut= 300
deep-libs1 imported elaT=0.0 sec
myArg: verb 1
myArg: dataPath /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/dlDataFormattedPred.20230203_101306
myArg: outPath /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/outPR.20230203_101306
myArg: noXterm True
myArg: inputfasta /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/5244.fasta
myArg: inputyml /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/dlFeatures.20230203_101306/yml
myArg: arrIdx 1
myArg: kfoldOffset 0
disable Xterm
Plotter_Plasmid : Graphics started
DL_Model , prj: assayer4
globFeatureL 17 fixed order: ['gc_content', 'len_sequence', 'plassketch', 'plasORIsketch', 'chromsketch', 'genecount', 'genesperMB', 'aalenavg', 'pfam_vector', 'A_longestHomopol', 'A_totalLongHomopol', 'C_longestHomopol', 'C_totalLongHomopol', 'T_longestHomopol', 'T_totalLongHomopol', 'G_longestHomopol', 'G_totalLongHomopol']
Traceback (most recent call last):
  File "/srv/jgi-ml/classifier/dl/format_predict.py", line 72, in <module>
    plasmGFD=get_glob_info_files(plasmGFDir)
  File "/srv/jgi-ml/classifier/dl/Util_Plasmid.py", line 62, in get_glob_info_files
    allL=os.listdir(dir0)
FileNotFoundError: [Errno 2] No such file or directory: '/tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/dlFeatures.20230203_101306/yml'
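
For readers following the traceback: the exception comes from `os.listdir` being called on a directory that does not exist, which suggests that the step expected to write the `yml` feature directory did not produce it. The sketch below is not deeplasmid's code; `list_feature_files` is a hypothetical helper that shows the same failure mode with a more descriptive message.

```python
import os


def list_feature_files(yml_dir):
    """Return the sorted entries of yml_dir.

    Hypothetical helper, not part of deeplasmid. It raises the same
    exception type as a bare os.listdir() call on a missing path, but
    with a hint that an earlier step may have failed.
    """
    if not os.path.isdir(yml_dir):
        raise FileNotFoundError(
            f"Feature directory not found: {yml_dir}. "
            "An earlier pipeline step may not have produced it."
        )
    return sorted(os.listdir(yml_dir))


if __name__ == "__main__":
    # Placeholder path for illustration only.
    try:
        print(list_feature_files("output/dlFeatures.example/yml"))
    except FileNotFoundError as err:
        print(err)
```

When a directory like this is missing, the log output of the earlier steps is usually the place to look for the underlying cause.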

wandreopoulos commented 1 year ago

Hi, if you are running the deeplasmid code natively on HPC, you will also need to install BBtools and Prodigal, and download the models and hmm files from the NERSC directory. I provided a Docker image that contains all of these, but perhaps you can't run Docker on your HPC for security reasons. The README describes what you would need to do if you rebuilt the Docker image, and the same applies if you want to run natively:

Please see the Supplementary Information from the publication for things to consider when building the Docker image: Prodigal and bbtools/sketch need to be built, and the model .h5 files from training are needed, as well as several sketch files and Pfam-A.TMP2.hmm, which can be downloaded from https://portal.nersc.gov/dna/microbial/assembly/deeplasmid/ .

https://github.com/wandreopoulos/deeplasmid/#troubleshooting-a-gpu-run

Let me know if I can help.
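
For a native install, a quick pre-flight check can confirm that these pieces are in place before starting a long run. The following is a minimal sketch, not part of deeplasmid: the `preflight` function, the directory names, and the executable names passed to it (including `comparesketch.sh` as a stand-in for the BBtools sketch tool) are assumptions that need to be adapted to the local installation.

```python
import glob
import os
import shutil


def preflight(tools, files, model_dir):
    """Return a list of human-readable problems; an empty list means all checks passed.

    tools:     executable names expected on PATH (assumed names, adjust locally)
    files:     paths to required data files such as Pfam-A.TMP2.hmm
    model_dir: directory expected to hold the trained .h5 model files
    """
    problems = []
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"executable not on PATH: {tool}")
    for path in files:
        if not os.path.isfile(path):
            problems.append(f"missing file: {path}")
    if not glob.glob(os.path.join(model_dir, "*.h5")):
        problems.append(f"no .h5 model files in: {model_dir}")
    return problems


if __name__ == "__main__":
    # All names and paths below are placeholders for illustration.
    issues = preflight(
        tools=["prodigal", "comparesketch.sh"],
        files=[os.path.join("deeplasmid_data", "Pfam-A.TMP2.hmm")],
        model_dir=os.path.join("deeplasmid_data", "models"),
    )
    for issue in issues:
        print(issue)
    if not issues:
        print("all checks passed")
```

The check only tests for presence, not for correct versions or file integrity, so a clean result does not guarantee a successful run.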


Thanks,
Bill

William B. Andreopoulos, Ph.D.
Joint Genome Institute, LBNL

michhulin commented 1 year ago

Hi Bill,

Many thanks, we have got it working after downloading the additional files.

Best wishes Michelle