weberlab-hhu / Helixer

Using Deep Learning to predict gene annotations
GNU General Public License v3.0

Some tensorflow warning/error messages when running Helixer via Singularity #123

Open spoonbender76 opened 2 months ago

spoonbender76 commented 2 months ago

Hi, I'm running Helixer v0.3.3 via Singularity v4.0.3:

singularity pull docker://gglyptodon/helixer-docker:helixer_v0.3.3_cuda_11.8.0-cudnn8
singularity run --nv helixer-docker_helixer_v0.3.3_cuda_11.8.0-cudnn8.sif Helixer.py --fasta-path Nm.softmasked.fa --lineage invertebrate --gff-output-path Nm_helixer.gff3 --batch-size 8

Can I safely ignore these TensorFlow warnings/error messages, or might they affect performance/results?

setting self.n_seqs to 4932, bc that is len of data/X
2024-04-11 11:40:25.627235: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:432] Loaded cuDNN version 8906
2024-04-11 11:40:28.342978: I tensorflow/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: Permission denied
2024-04-11 11:40:28.344534: I tensorflow/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: Permission denied
2024-04-11 11:40:28.344551: W tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:109] Couldn't get ptxas version : FAILED_PRECONDITION: Couldn't get ptxas/nvlink version string: INTERNAL: Couldn't invoke ptxas --version
2024-04-11 11:40:28.346127: I tensorflow/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: Permission denied
2024-04-11 11:40:28.346192: W tensorflow/compiler/xla/stream_executor/gpu/redzone_allocator.cc:318] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.

haessar commented 2 months ago

I'm encountering similar messages when training a model with HybridModel.py using Apptainer v1.1.8 (the rebranded Singularity) with the latest Docker container (helixer-docker:helixer_v0.3.3_cuda_11.8.0-cudnn8). I'm now worried that my model training is running sub-optimally (i.e. slowly), so I would appreciate a response.

Jiangjiangzhang6 commented 1 month ago

I ran into the same error, in the same situation haessar describes above. I don't have root access; I can only use Singularity.

alisandra commented 1 month ago

Hi, thanks for raising this; will look into these errors more closely for the next release. I strongly suspect you can ignore them.
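
For anyone who wants to look closer in the meantime: the log above says TensorFlow could not launch ptxas ("Permission denied") and is relying on the driver for PTX compilation instead. A minimal Python sketch for checking whether ptxas is found on $PATH and can be started from the current environment might look like this. The function name check_tool is made up for illustration and is not part of Helixer or TensorFlow; it only uses the Python standard library.

```python
import shutil
import subprocess


def check_tool(name="ptxas"):
    """Report whether `name` is on $PATH and can actually be launched.

    Hypothetical helper for illustration; not part of Helixer or TensorFlow.
    """
    # shutil.which only returns a path if the file is executable for this user.
    path = shutil.which(name)
    if path is None:
        return {"found": False, "path": None, "runs": False,
                "detail": "not on $PATH or not executable"}
    try:
        result = subprocess.run([path, "--version"], capture_output=True,
                                text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # PermissionError (a subclass of OSError) would land here.
        return {"found": True, "path": path, "runs": False, "detail": str(exc)}
    return {"found": True, "path": path, "runs": result.returncode == 0,
            "detail": (result.stdout or result.stderr).strip()}
```

To be meaningful this would have to run inside the container (for example via singularity exec --nv with the same .sif image, assuming python3 is available there), since $PATH and permissions on the host can differ. If "runs" comes back False, that would match the fallback to driver-side compilation that the warning describes.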

Helixer should run at roughly 100 Mbp of genome per 30 min (or faster; this depends on hardware, batch size, and gene density). If it's much slower than that, please let us know: that would be unexpectedly slow and might mean it is running on the CPU instead of the GPU.
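
As a rough way to apply that rule of thumb, the sketch below counts the bases in a FASTA file and converts them into an expected runtime at 100 Mbp per 30 min, and also lists the GPUs TensorFlow can see. The function names are hypothetical and not part of Helixer; the only TensorFlow call used is tf.config.list_physical_devices, and the estimate is just the ballpark figure above, not a guarantee.

```python
def total_bases(fasta_path):
    """Count sequence characters in a FASTA file, skipping header lines."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                total += len(line.strip())
    return total


def expected_minutes(n_bases, mbp_per_30_min=100.0):
    """Ballpark runtime in minutes, assuming ~100 Mbp per 30 min."""
    return n_bases / 1e6 / mbp_per_30_min * 30.0


def visible_gpus():
    """Return names of GPUs TensorFlow can see, or None if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # An empty list here would suggest TensorFlow is falling back to the CPU.
    return [device.name for device in tf.config.list_physical_devices("GPU")]
```

For example, expected_minutes(total_bases("Nm.softmasked.fa")) gives a number to compare against the actual wall-clock time; a 250 Mbp genome would come out at about 75 minutes. If the real run takes many times longer and visible_gpus() returns an empty list inside the container, that would point towards the CPU fallback mentioned above.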