NCBI Taxonomy - Githubissues

healthyPlant / PhytoPipe


NCBI Taxonomy #1

Open sherlock0088 opened 1 year ago

sherlock0088 commented 1 year ago

Hi,

I tried to run this command from the wiki with block size 1M:

parallel -k --pipepart -a prot.accession2taxid --block 100M fgrep -F -f RVDB-prot.id.txt | cut -f 2,3 > RVDB-prot.taxonId.txt

Then I got the message "grep: memory exhausted".

I am wondering if you have a solution for this issue. Our workstation only has 128G of memory.

Best, Yupeng

xhu556 commented 1 year ago

"parallel" is only used to speed up the process, so I think you can leave it out, although the run will then take a long time. Try this:

fgrep prot.accession2taxid -F -f RVDB-prot.id.txt | cut -f 2,3 > RVDB-prot.taxonId.txt

If you still have the same problem, please let me know. I can write a script for you.
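For readers who hit the same memory limit, here is a minimal sketch of what such a script might look like. It is not the maintainer's script. It assumes that RVDB-prot.id.txt holds one accession.version ID per line and that prot.accession2taxid is tab-separated, with accession.version in column 2 and the taxid in column 3 (the columns that `cut -f 2,3` selects above). The function name `filter_taxids` is hypothetical. Unlike `fgrep -F -f`, which matches each ID as a substring anywhere on the line, this sketch matches column 2 exactly, so its output can differ if any ID is a substring of another accession.

```shell
#!/usr/bin/env bash
# Sketch only: exact-match lookup of IDs in an accession-to-taxid table.
# Assumptions (not confirmed by the thread):
#   - the ID file has one accession.version per line
#   - the table is tab-separated: column 2 = accession.version, column 3 = taxid

filter_taxids() {
    local ids_file="$1"        # e.g. RVDB-prot.id.txt
    local acc2taxid_file="$2"  # e.g. prot.accession2taxid
    awk -F'\t' '
        NR == FNR { ids[$1]; next }      # first file: store every ID as an array key
        $2 in ids { print $2 "\t" $3 }   # second file: print rows whose column 2 is a stored ID
    ' "$ids_file" "$acc2taxid_file"
}

# Usage, with the file names from this thread:
# filter_taxids RVDB-prot.id.txt prot.accession2taxid > RVDB-prot.taxonId.txt
```

The ID list is loaded into memory once and the large table is streamed line by line, so memory use should scale with the number of IDs rather than with the size of prot.accession2taxid. By contrast, each grep process that `parallel` starts loads the full pattern file, which is a likely reason the parallel run ran out of memory.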

sherlock0088 commented 1 year ago

Hi,

Thanks for the quick reply.

The new code worked fine.

Here is one more issue. When I ran the pipeline with Docker, I got some error messages. I have attached the log file. I have double-checked that the files mentioned in runDocker.sh are in the directories I gave as input in runDocker.sh.

docker.log runDocker.txt

Best, Yupeng

xhu556 commented 1 year ago

Please pull our new Docker image "healthyplant/phytopipe" and follow the steps in the issue "Kaiju segmentation fault in Docker image" to test Kaiju. I'm afraid the error is caused by your computer's limited memory (128G).