Closed Vijithkumar2020 closed 1 month ago
Hi @Vijithkumar2020,
Sorry for the slow reply.
From the directory where you ran Docker, can you check whether $PWD/interproscan-5.69-101.0/data/pirsr/2023_05 exists?
Thank you for your response. Yes, $PWD/interproscan-5.69-101.0/data/pirsr/2023_05 does exist. However, the process runs painfully slowly. The machine is a local server with 256 GB of memory and an 8-core CPU. Here is the run report so far. Looking forward to a response. https://docs.google.com/document/d/1RH8SHBXB1UJ-b6zu6ePkLnFDcf4DcUsTRYVZQKsXBvU/edit?usp=sharing
The logs you added to Google Docs do not contain the same error (missing file). The error seems to be related to the match lookup service. You can turn off the match lookup with the -dp option.
I also noted the following line:
27/09/2024 09:12:04:929 Uploaded 505641 unique sequences for analysis
This is a large number of sequences. InterProScan5 isn't very efficient with large input files. I'd suggest splitting the input into 10-15 smaller files.
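The split suggested above could be done with a short awk helper. This is only a sketch: the function name `split_fasta`, the chunk size, and the output prefix are illustrative assumptions, not part of InterProScan. It starts a new file every N sequence headers, so no sequence is cut in half.

```shell
# Hypothetical helper: split a FASTA file into chunks of at most N sequences.
# Usage: split_fasta INPUT.fasta SEQS_PER_CHUNK OUTPUT_PREFIX
# Writes OUTPUT_PREFIX_1.fasta, OUTPUT_PREFIX_2.fasta, ...
split_fasta() {
  awk -v n="$2" -v prefix="$3" '
    /^>/ {
      # Every n-th header opens the next chunk file.
      if (count % n == 0) {
        if (out != "") close(out)
        part++
        out = prefix "_" part ".fasta"
      }
      count++
    }
    # Copy header and sequence lines to the current chunk;
    # anything before the first header is skipped.
    out != "" { print > out }
  ' "$1"
}
```

For example, `split_fasta all_proteins.fasta 50000 input/out` (with `all_proteins.fasta` as a placeholder input name) would turn 505641 sequences into 11 files: ten with 50000 sequences each and one with the remaining 5641.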
Thank you for your valuable suggestion. I have started re-running with the match lookup turned off. Here is the updated Docker command. Kindly let me know if this is good to go. Each of the smaller files contains 50k sequences. Looking forward to a response from you.
sudo docker run --rm \
    -v $PWD/interproscan-5.69-101.0/data:/opt/interproscan/data \
    -v $PWD/input:/input \
    -v $PWD/temp:/temp \
    -v $PWD/output:/output \
    interpro/interproscan:5.69-101.0 \
    --input /input/out_1.fasta \
    --output-dir /output \
    --tempdir /temp \
    --cpu 8 \
    --formats tsv,xml \
    -dp
Regards
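Because the input is now spread over several files, the command above has to be repeated for each one. A loop along the following lines is one way to do that. It is a sketch that assumes the chunks are named `out_*.fasta` and sit in `$PWD/input`. The function name `run_chunks` and the `DOCKER` variable are illustrative additions, not InterProScan or Docker features.

```shell
# Hypothetical wrapper: run the Docker command from this thread once per chunk.
# Set DOCKER=echo to print the commands instead of running them (dry run).
run_chunks() {
  for fasta in "$PWD"/input/out_*.fasta; do
    [ -e "$fasta" ] || continue          # glob matched nothing
    name=$(basename "$fasta")
    ${DOCKER:-sudo docker} run --rm \
      -v "$PWD/interproscan-5.69-101.0/data:/opt/interproscan/data" \
      -v "$PWD/input:/input" \
      -v "$PWD/temp:/temp" \
      -v "$PWD/output:/output" \
      interpro/interproscan:5.69-101.0 \
      --input "/input/$name" \
      --output-dir /output \
      --tempdir /temp \
      --cpu 8 \
      --formats tsv,xml \
      -dp || return 1                    # stop at the first failing chunk
  done
}
```

Running `DOCKER=echo run_chunks` first prints each command without executing it, which makes it easy to check the paths before committing to a long run. The chunks are processed one after another, so each run gets all 8 cores.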
This looks OK. Have you had any trouble running the command you posted?
I was running InterProScan (v5.69-101.0) using the test E. coli file, but the run terminated prematurely and produced the following run report. I am not able to understand it. Can you help me resolve this issue?