Hi @vdruelle ,
The error seems to be caused by the warning lines:
2024-07-02 14:40:08.983 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0058 MZ501093.1 prediction has length 0
2024-07-02 14:43:19.564 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0166 MZ501093.1 prediction has length 0
2024-07-02 14:43:21.652 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0174 MZ501093.1 prediction has length 0
2024-07-02 14:43:28.210 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0192 MZ501093.1 prediction has length 0
2024-07-02 14:43:31.288 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0200 MZ501093.1 prediction has length 0
2024-07-02 14:43:36.685 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0215 MZ501093.1 prediction has length 0
2024-07-02 14:43:39.533 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0221 MZ501093.1 prediction has length 0
which I would assume then causes the 0-d array issue.
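To list the affected proteins from a saved log, the warning lines can be parsed directly. This is a minimal sketch that assumes the log format shown above; `failed_cds` is a hypothetical helper, not part of Phold.

```python
import re

# Two of the warning lines quoted above, used as sample input.
LOG = """\
2024-07-02 14:40:08.983 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0058 MZ501093.1 prediction has length 0
2024-07-02 14:43:19.564 | WARNING | phold.features.predict_3Di:get_embeddings:493 - BUDPYTUS_CDS_0166 MZ501093.1 prediction has length 0
"""

# Assumed layout: "<timestamp> | WARNING | <module:function:line> - <CDS id> <contig id> prediction has length 0"
PATTERN = re.compile(
    r"WARNING\s*\|\s*\S+\s+-\s+(\S+)\s+\S+\s+prediction has length 0"
)


def failed_cds(log_text):
    """Return the CDS identifiers whose prediction was reported with length 0."""
    return [match.group(1) for match in PATTERN.finditer(log_text)]


if __name__ == "__main__":
    for cds_id in failed_cds(LOG):
        print(cds_id)
```

The resulting identifiers can then be checked against the Pharokka output to see which proteins are missing a 3Di prediction.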
When Phold tries to write the 3Di sequences, it will iterate over an empty array for these proteins, which clearly errors out.
I'll put in some fix for the next version.
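As an illustration of the failure mode, here is a rough sketch that assumes a failed prediction ends up as a 0-d NumPy array. It is not Phold's actual code or the planned fix; `predictions` and `safe_tokens` are hypothetical names used only for this example.

```python
import numpy as np

# Hypothetical per-protein predictions; the real Phold data structure may differ.
predictions = {
    "BUDPYTUS_CDS_0057": np.array([3, 7, 1]),
    "BUDPYTUS_CDS_0058": np.array(0),  # assumed shape of a failed prediction (0-d)
}

# Iterating over a 0-d array raises a TypeError.
try:
    for _ in predictions["BUDPYTUS_CDS_0058"]:
        pass
    error_message = ""
except TypeError as exc:
    error_message = str(exc)


def safe_tokens(pred):
    """Return the prediction as a list, or None if it is 0-d or empty."""
    arr = np.asarray(pred)
    if arr.ndim == 0 or arr.size == 0:
        return None
    return arr.tolist()


# Guarded loop: skip proteins without a usable prediction instead of crashing.
written, skipped = {}, []
for cds_id, pred in predictions.items():
    tokens = safe_tokens(pred)
    if tokens is None:
        skipped.append(cds_id)
        continue
    written[cds_id] = tokens
```

With a guard like this the run would complete, but the skipped proteins would still lack 3Di sequences and therefore annotations.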
In the meantime, here is a practical workaround, since it will take a while before the next release and in any case you would miss potential annotations for these proteins. The embedding failure that is the root cause of this error is probably caused by your hardware. Therefore, I'd recommend:
George
Hi @gbouras13,
Thanks a lot for the answer and suggestions to fix this problem. I'll give it a try in the following days.
Have a great day! Valentin
Description
I'm trying to generate annotations for a couple of phages from the BASEL collection (like this one from NCBI: https://www.ncbi.nlm.nih.gov/nuccore/2071745857) to test the performance of the tool. I first generate a GenBank file using Pharokka, which seems to work fine: the tool completes the job and I obtain the pharokka.gbk file in the output folder.
I then try to use phold on this file with the command:
phold run -i output/EM60_pharokka/pharokka.gbk -o output/EM60_phold -t 8 -f --cpu
I'm running the CPU version since my GPU doesn't have enough memory for the GPU version. The tool starts but eventually fails at the ProstT5 prediction step. I'm copy-pasting the terminal output below. I tried to figure out what the problem was, but my attempts were inconclusive.
Do you have an idea where the issue comes from? Thanks and have a great day!