GreyGuoweiChen closed this issue 2 years ago
I think there's an issue when downloading assembly_summary.txt, caused either by an unstable connection or by something on the NCBI servers. It happened to me recently as well. I'm implementing a file checker, to be released in the next version, to hopefully avoid this issue. I'll update this thread when it is available.
For now, an alternative is to manually download the bacterial RefSeq assembly_summary.txt, make sure it's complete, and use it as an external input to genome_updater:
./genome_updater.sh -e assembly_summary.txt -d "refseq" -c "representative genome" -f "protein.faa.gz" -o "test_refseq" -t 32 -m -k
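One possible way to "make sure it's complete" before passing the file with -e is sketched below. This is not genome_updater's own file checker, only a heuristic: it assumes a truncated download shows up either as a missing final newline or as a data row whose number of tab-separated fields differs from the first data row. The function name check_assembly_summary is hypothetical.

```shell
# Heuristic completeness check for a downloaded assembly_summary.txt.
# Assumption (not from genome_updater): a truncated download either lacks
# a final newline or contains a row with a different number of tab-separated
# fields than the first data row. Lines starting with '#' are skipped.
check_assembly_summary() {
    file="$1"

    # The file must exist and be non-empty.
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi

    # The last byte must be a newline; command substitution strips a
    # trailing newline, so a complete file yields an empty string here.
    if [ -n "$(tail -c 1 "$file")" ]; then
        echo "no trailing newline (truncated?): $file" >&2
        return 1
    fi

    # Every data row must have the same field count as the first data row.
    # Diagnostics go to stderr; the exit status is non-zero on any mismatch
    # or when the file holds no data rows at all.
    awk -F '\t' '
        /^#/    { next }
        n == 0  { n = NF }
        NF != n { printf "line %d: %d fields, expected %d\n", NR, NF, n; bad = 1 }
        END     { exit (bad || n == 0) }
    ' "$file" >&2
}

# Example usage (requires network access, so it is left commented out):
#   curl -sS -o assembly_summary.txt \
#       https://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt
#   check_assembly_summary assembly_summary.txt && echo "looks complete"
```

A passing check does not prove the file is complete (a download cut exactly at a line boundary would still pass), so comparing the line count of two independent downloads is a reasonable extra safeguard.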
There were some improvements implemented in the new version (v0.5.0) to solve this problem. Please give it a try and re-open this issue if the problem persists.
Hello @pirovc, when I download the bacterial proteins from RefSeq using genome_updater, the number of entries varies from run to run. Do you have any idea why this happens? I use a command like: genome_updater.sh -c "representative genome" -g "bacteria" -d "refseq" -f "protein.faa.gz" -o "testrefseq" -t 32 -m -k
And this occurred to me: 1:
2:
Any suggestions would help.