Open michhulin opened 1 year ago
Hi,
If you are running the deeplasmid code natively on HPC, you will also need to install BBTools and Prodigal, and download the models and hmm files from the NERSC directory. I provided a Docker image that contains all of these, but perhaps you can't run Docker on your HPC for security reasons. The README describes what you would need to do if you rebuilt the Docker image, and the same applies if you want to run natively:
Please see the Supplementary Information from the publication for things to consider when building the Docker image: Prodigal and bbtools/sketch need to be built, and the model .h5 files from training are needed, as well as several sketch files and Pfam-A.TMP2.hmm, which can be downloaded from https://portal.nersc.gov/dna/microbial/assembly/deeplasmid/ .
See also: https://github.com/wandreopoulos/deeplasmid/#troubleshooting-a-gpu-run
Let me know if I can help.
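For a native install, a quick pre-flight check can confirm that the pieces listed above are in place before starting a run. The following is only a rough sketch, not part of deeplasmid: the executable names (`prodigal`, `comparesketch.sh`) and the glob patterns for the model and sketch files are assumptions, and the function names are made up for illustration. Adjust them to match how the tools and downloaded files are laid out on your system.

```python
import shutil
from pathlib import Path

# Assumption: names under which the tools are exposed on PATH on your system.
REQUIRED_TOOLS = ["prodigal", "comparesketch.sh"]

# Assumption: patterns for the files fetched from the NERSC directory.
# Only Pfam-A.TMP2.hmm is named explicitly in this thread; the rest are illustrative.
REQUIRED_FILE_PATTERNS = ["*.h5", "*.sketch", "Pfam-A.TMP2.hmm"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def missing_files(data_dir, patterns=REQUIRED_FILE_PATTERNS):
    """Return the patterns that match no file anywhere under data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        # Nothing can match if the directory itself is absent.
        return list(patterns)
    return [pat for pat in patterns if not any(root.rglob(pat))]


def preflight_report(data_dir):
    """Return human-readable problem descriptions; an empty list means all checks passed."""
    problems = [f"tool not on PATH: {t}" for t in missing_tools()]
    problems += [f"no file matching {p!r} under {data_dir}" for p in missing_files(data_dir)]
    return problems
```

Calling `preflight_report("/path/to/downloaded/files")` and printing each returned line gives a short list of what still needs to be installed or downloaded.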
On Fri, Feb 3, 2023 at 2:24 AM michhulin @.***> wrote:
Hi,
I am trying to run deeplasmid on the HPC at my workplace. The computer support team have installed the software. We have been trying it out and came across this error where it cannot find the file "yml" in the output folder. I was wondering if you could advise please?
Many thanks Michelle
TOTAL_RUNTIME: 310.58503294
exiting........
2023-02-03 10:18:19.956907: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
oneHot base, size= 15 , sample:
base: A 1-hot: [1.0, 0.0, 0.0, 0.0]
base: C 1-hot: [0.0, 1.0, 0.0, 0.0]
base: T 1-hot: [0.0, 0.0, 1.0, 0.0]
base: G 1-hot: [0.0, 0.0, 0.0, 1.0]
all bases : ['A', 'C', 'T', 'G', 'N', 'Y', 'K', 'M', 'V', 'S', 'H', 'R', 'W', 'B', 'D']
use seqLenCut= 300
deep-libs1 imported elaT=0.0 sec
myArg: verb 1
myArg: dataPath /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/dlDataFormattedPred.20230203_101306
myArg: outPath /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/outPR.20230203_101306
myArg: noXterm True
myArg: inputfasta /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/5244.fasta
myArg: inputyml /tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/dlFeatures.20230203_101306/yml
myArg: arrIdx 1
myArg: kfoldOffset 0
disable Xterm
Plotter_Plasmid : Graphics started
DL_Model , prj: assayer4
globFeatureL 17 fixed order: ['gc_content', 'len_sequence', 'plassketch', 'plasORIsketch', 'chromsketch', 'genecount', 'genesperMB', 'aalenavg', 'pfam_vector', 'A_longestHomopol', 'A_totalLongHomopol', 'C_longestHomopol', 'C_totalLongHomopol', 'T_longestHomopol', 'T_totalLongHomopol', 'G_longestHomopol', 'G_totalLongHomopol']
Traceback (most recent call last):
  File "/srv/jgi-ml/classifier/dl/format_predict.py", line 72, in <module>
    plasmGFD=get_glob_info_files(plasmGFDir)
  File "/srv/jgi-ml/classifier/dl/Util_Plasmid.py", line 62, in get_glob_info_files
    allL=os.listdir(dir0)
FileNotFoundError: [Errno 2] No such file or directory: '/tsl/scratch/hulin/pseudomonas/analysis/plasmids/rfplasmid/deep/new/output/dlFeatures.20230203_101306/yml'
-- Thanks, Bill
William B. Andreopoulos, Ph.D. Joint Genome Institute LBNL
Hi Bill,
Many thanks, we have got it working after downloading the additional files.
Best wishes Michelle
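For anyone who hits the same FileNotFoundError, it can help to check whether the feature-extraction step actually produced its `yml` directory before the prediction step runs. The snippet below is a minimal sketch under that assumption: it relies only on the `dlFeatures.<timestamp>/yml` layout visible in the log in this thread, and the function name `find_feature_yml_dirs` is hypothetical, not part of deeplasmid.

```python
from pathlib import Path


def find_feature_yml_dirs(output_dir):
    """Map each dlFeatures.* run directory name to whether its 'yml' subdirectory exists."""
    root = Path(output_dir)
    return {
        run_dir.name: (run_dir / "yml").is_dir()
        for run_dir in sorted(root.glob("dlFeatures.*"))
        if run_dir.is_dir()
    }
```

A run directory that maps to `False` means the feature step finished without writing any yml files. In this thread that turned out to be caused by the missing additional files, so the install is the first thing to check.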