ncbi / pgap

NCBI Prokaryotic Genome Annotation Pipeline

pgap --update showing huge file size during installation #292

Closed Sabarish2001 closed 4 months ago

Sabarish2001 commented 4 months ago

Hello, I installed PGAP a few days ago by downloading the script with wget -O pgap.py https://github.com/ncbi/pgap/raw/prod/scripts/pgap.py . I then updated it with the command ./pgap.py --update as instructed. During the update it displayed the message below, and the file size is huge. I followed the NCBI webinar on installing PGAP, but they mentioned it would install in a couple of minutes. I just want to know whether I have messed up the Docker installation.

Following is the message displayed while running ./pgap.py --update

(Annotation) mol-17@mol17-ThinkCentre-M70t-Gen-3:~/PGAP$ ./pgap.py --update

The latest version of PGAP is 2023-10-03.build7061, you have nothing installed locally.
installation directory: /home/mol-17/.pgap
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sda5      282137184 47925812 219806560  18% /
Downloading (as needed) Docker image ncbi/pgap:2023-10-03.build7061
Downloading and extracting tarball: https://s3.amazonaws.com/pgap/input-2023-10-03.build7061.tgz
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=ncbi%2Fpgap&tag=2023-10-03.build7061": dial unix /var/run/docker.sock: connect: permission denied
None
Downloaded 125327872 of 18345810574 bytes (0.68%)
^Z

Can somebody shed some light on my question?

Thank you in advance
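For scale, the figures in the progress line and the df output above can be checked with a little arithmetic. This is only an illustrative sketch; the helper names are made up here and are not part of pgap.py.

```python
def percent_done(downloaded: int, total: int) -> float:
    """Return download progress as a percentage."""
    return 100.0 * downloaded / total


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


# Values copied from the progress line in the log above.
downloaded = 125_327_872
total = 18_345_810_574

# "Available" column of the df output, which is reported in 1K blocks.
available_bytes = 219_806_560 * 1024

print(f"progress:     {percent_done(downloaded, total):.2f}%")
print(f"tarball size: {to_gib(total):.1f} GiB")
print(f"free space:   {to_gib(available_bytes):.1f} GiB")
```

The large size in the progress line belongs to the input tarball named in the log (about 18.3 GB), and the disk has room for it, so the size on its own does not point to a broken Docker installation.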

azat-badretdin commented 4 months ago

Thank you, user @Sabarish2001 for your post!

It seems that the problem occurred during the Docker image installation. You might want to contact your local Docker experts to figure out what to do here. The command used internally is docker pull ncbi/pgap:2023-10-03.build7061

I would start by running this command yourself and seeing what happens.
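Since the log reports "permission denied" on /var/run/docker.sock, one quick thing to check before or alongside docker pull is whether the current user may read and write that socket at all. A minimal sketch, assuming the socket path from the error message; the function name is hypothetical:

```python
import os

DOCKER_SOCKET = "/var/run/docker.sock"  # path taken from the error message


def can_use_socket(path: str = DOCKER_SOCKET) -> bool:
    """Return True if the path exists and the current user may read and write it."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if can_use_socket():
        print("Current user can access", DOCKER_SOCKET)
    else:
        print("No access to", DOCKER_SOCKET, "for the current user")
```

If this prints the second message, docker pull run as the same user would be expected to fail with the same permission error, and the fix belongs on the Docker side rather than in pgap.py.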

Sabarish2001 commented 4 months ago

Thank you for your quick response, @azat-badretdin. So you are asking me to run the command "docker pull ncbi/pgap:2023-10-03.build7061", right? In that case, all I need is to have Docker installed on my system, and I don't need to install it by following the step by step instructions provided by NCBI on GitHub, right?

I am asking because there are no local Docker experts here. So the alternative you have suggested is the one I should try, right?

azat-badretdin commented 4 months ago

I don't need to install it by following the step by step instructions provided by NCBI on GitHub right?

Besides the Docker image, running ./pgap.py with the installation command-line parameters also installs the reference data and test genomes, and those steps must succeed too. I hope that by pulling the Docker image as a separate command you will resolve the particular Docker issue you have, and that the installation will then detect that the image has already been loaded.

How is docker pull going?

Sabarish2001 commented 4 months ago

Hello @azat-badretdin, after running pgap.py --update as instructed, this is the message I got:

The latest version of PGAP is 2023-10-03.build7061, you have nothing installed locally.
installation directory: /home/mol-17/.pgap
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sda5      282137184 47923068 219809304  18% /
Downloading (as needed) Docker image ncbi/pgap:2023-10-03.build7061
Downloading and extracting tarball: https://s3.amazonaws.com/pgap/input-2023-10-03.build7061.tgz
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=ncbi%2Fpgap&tag=2023-10-03.build7061": dial unix /var/run/docker.sock: connect: permission denied
None
Downloaded 18345810574 of 18345810574 bytes (100.00%)
Traceback (most recent call last):
  File "./pgap.py", line 994, in main
    params = Setup(args)
  File "./pgap.py", line 558, in __init__
    self.update()
  File "./pgap.py", line 718, in update
    raise Exception(f'installation of some or all of components failed. Please remove {self.data_path}, {self.install_dir}/test_genomes, {self.test_genomes_path} directories and try again.')
Exception: installation of some or all of components failed. Please remove /home/mol-17/.pgap/input-2023-10-03.build7061, /home/mol-17/.pgap/test_genomes, /home/mol-17/.pgap/test_genomes-2023-10-03.build7061 directories and try again.

Can you tell me what might have gone wrong?
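The exception asks for three directories to be removed before retrying. A small sketch of that cleanup is below; the directory names come from the exception message, while the function names and the dry-run default are assumptions made for illustration and are not part of pgap.py.

```python
import shutil
from pathlib import Path


def leftover_dirs(install_dir: Path, version: str) -> list[Path]:
    """Directories named in the exception message for a given PGAP version."""
    return [
        install_dir / f"input-{version}",
        install_dir / "test_genomes",
        install_dir / f"test_genomes-{version}",
    ]


def remove_leftovers(install_dir: Path, version: str, dry_run: bool = True) -> list[Path]:
    """Report leftover paths; actually delete them only when dry_run is False."""
    found = [p for p in leftover_dirs(install_dir, version)
             if p.exists() or p.is_symlink()]
    if not dry_run:
        for path in found:
            if path.is_symlink():
                path.unlink()        # remove the link itself, not its target
            elif path.exists():
                shutil.rmtree(path)  # remove a real directory tree
    return found


if __name__ == "__main__":
    # Dry run against the default install directory shown in the log.
    for p in remove_leftovers(Path.home() / ".pgap", "2023-10-03.build7061"):
        print("would remove:", p)
```

Run it once with the default dry_run=True to see what would be deleted, then pass dry_run=False before rerunning ./pgap.py --update.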

Sabarish2001 commented 4 months ago

Hello @azat-badretdin ,

I also ran "docker pull ncbi/pgap:2023-10-03.build7061" as you suggested:

2023-10-03.build7061: Pulling from ncbi/pgap
2d473b07cdd5: Pulling fs layer
b52a90f9a9d2: Pulling fs layer
40f5ebbf5b49: Pulling fs layer
98da8a1cb22f: Waiting
f94a9e1dd98b: Waiting
07a1d0cd53da: Waiting
3460df8d03d7: Waiting
0f33e9a93715: Waiting
b640fa92d912: Waiting
4d10ec096652: Pull complete
2b50222ab7bc: Pull complete
85197a636de2: Pull complete
a977499672c4: Pull complete
895822ead3b4: Pull complete
cb62709a50dd: Pull complete
a72005e4c7d8: Pull complete
3055c2ec87e3: Pull complete
6bc0954fb4da: Pull complete
9bab7de95c36: Pull complete
b7f8ccab9440: Pull complete
29b91ff48d19: Pull complete
a01881da90ca: Pull complete
88fc13c9b338: Pull complete
31e23311893b: Pull complete
2c183caed7c0: Pull complete
d306e4893fc8: Pull complete
58f15f4e1e16: Pull complete
b483c0565f85: Pull complete
e45db5f3afd1: Pull complete
d5c4cc2fbb2d: Pull complete
33cf0f1b3356: Pull complete
6993cb74e94b: Pull complete
e2b9ca408e38: Pull complete
8564960125ac: Pull complete
0c97a3d4ca2d: Pull complete
ad47c460ff02: Pull complete
66c44f46beea: Pull complete
1d68326148ef: Pull complete
9f69ab609e33: Pull complete
ab3f1a0e1a97: Pull complete
f3557917c12c: Pull complete
5512f0d9d46d: Pull complete
4e698afa5583: Pull complete
b3adf85cd10f: Pull complete
fb7703466412: Pull complete
3ec756e456ea: Pull complete
b1112c3b877f: Pull complete
e75d1456f81f: Pull complete
a83809bade57: Pull complete
7a72510f3fa6: Pull complete
104cad1e49f6: Pull complete
3bd4c5292644: Pull complete
d9657e71219e: Pull complete
Digest: sha256:cfb81df863fe5423226b299e42c94b9420959a0cabacfa15a71f01acde8444f9
Status: Downloaded newer image for ncbi/pgap:2023-10-03.build7061
docker.io/ncbi/pgap:2023-10-03.build7061

I guess the pull was successful. As I am new to Docker, I don't actually know how to use it. I got the list of Docker images using the command "docker image ls", and the following was displayed:

REPOSITORY   TAG                    IMAGE ID       CREATED        SIZE
ncbi/pgap    2023-10-03.build7061   b86031082acb   5 months ago   13GB

Now, in order to use PGAP for annotation, how do I proceed?

azat-badretdin commented 4 months ago

after running pgap.py --update as instructed, this is the message I got

Did you run this after you ran the docker pull ... command?

For now it looks like you posted the output before you ran docker pull manually. Otherwise, you would have seen a message:

Status: Image is up to date for ncbi/pgap:2023-10-03.build7061

Could you please make sure that you ran ./pgap.py --update after you ran docker pull ...?

(You do not need to rerun docker pull ..., except perhaps out of curiosity, to see it report that the image has already been downloaded.)
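The two "Status:" lines quoted in this thread are enough to tell a fresh download from an image that was already present. A hedged sketch of that check follows; the function name and return labels are invented for illustration, and only the two status strings seen in this thread are recognized.

```python
def pull_status(output: str) -> str:
    """Classify `docker pull` output by its last 'Status:' line."""
    for line in reversed(output.splitlines()):
        line = line.strip()
        if line.startswith("Status: Image is up to date"):
            return "already-present"
        if line.startswith("Status: Downloaded newer image"):
            return "downloaded"
    return "unknown"


fresh = "Status: Downloaded newer image for ncbi/pgap:2023-10-03.build7061"
cached = "Status: Image is up to date for ncbi/pgap:2023-10-03.build7061"
print(pull_status(fresh))
print(pull_status(cached))
```

A log that contains neither line, such as one that ends in "permission denied", is reported as "unknown", which is the case worth investigating.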

Sabarish2001 commented 4 months ago

Hi @azat-badretdin, I ran ./pgap.py --update yesterday, and as the file size was showing around 18GB I let it install the necessary files. When I came back today, the message you can see above was displayed. Later, as you suggested, I ran docker pull ncbi/pgap:2023-10-03.build7061, and the download seems to have been successful, as you can see in the later output. After this I checked the images using the "docker images" command, and it listed the ncbi/pgap repository. Next I ran "docker run ncbi/pgap" and an error was thrown. I looked the error up on Stack Overflow and then tried "docker run ncbi/pgap:2023-10-03.build7061", and nothing happened. When I tried "docker run IMAGE id" it displayed two things: 1) input 2) pgap.

I checked pgap and there was a list of files in the pgap directory. But I am sure I have messed up somewhere.

azat-badretdin commented 4 months ago

Next I ran "docker run ncbi/pgap" and an error was thrown. I looked the error up on Stack Overflow and then tried "docker run ncbi/pgap:2023-10-03.build7061", and nothing happened. When I tried "docker run IMAGE id" it displayed two things: 1) input 2) pgap.

This is understandable. You are experimenting with Docker. But for the sake of checking what is going on with pgap.py - could you please run ./pgap.py --update again and post what happens?

Sabarish2001 commented 4 months ago

I am trying it again on my personal system. Can you tell me whether the following steps are the right way to install PGAP?

1) I need to install Docker. I am doing it with

sudo apt-get install docker.io

2) I download pgap.py using

$ wget https://github.com/ncbi/pgap/raw/prod/scripts/pgap.py

3) I have to change permissions using

chmod +x ./pgap.py

4) I have to run the update using

./pgap.py --update

In the end, the resulting files mean that PGAP was downloaded successfully, right?

Sabarish2001 commented 4 months ago

This is understandable. You are experimenting with Docker. But for the sake of checking what is going on with pgap.py - could you please run ./pgap.py --update again and post what happens?

Yeah, sure. I need some time, as I was working on a desktop at my university. However, I am giving it another try on my personal laptop to check whether everything goes right. I will let you know.

azat-badretdin commented 4 months ago

Can you tell me whether the following steps are the right way to install PGAP?

I can't comment on Docker installation.

The rest is correct.

If Docker behaves for you in the way described in the issue description when running pgap.py --update, please insert the docker pull ... step BEFORE the ./pgap.py --update step.
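The suggested order (pull the image first, then update) can be written down as a short script. This is a sketch under assumptions: the function names are hypothetical, the version string is the one reported in this thread, and nothing is executed unless dry_run is switched off.

```python
import subprocess


def install_steps(version: str) -> list[list[str]]:
    """Commands in the suggested order: docker pull first, then the update."""
    return [
        ["docker", "pull", f"ncbi/pgap:{version}"],
        ["./pgap.py", "--update"],
    ]


def run_steps(steps: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; run it (stopping on the first failure) unless dry_run."""
    for argv in steps:
        print("$", " ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


steps = install_steps("2023-10-03.build7061")
run_steps(steps)  # dry run: only prints the two commands
```

With check=True, a failing docker pull stops the script before ./pgap.py --update starts, so a Docker problem surfaces on its own instead of in the middle of the update output.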

Sabarish2001 commented 4 months ago

This is understandable. You are experimenting with Docker. But for the sake of checking what is going on with pgap.py - could you please run ./pgap.py --update again and post what happens?

Hello @azat-badretdin, I ran ./pgap.py --update and got the following:

(base) sabioinfo@LAPTOP-VC2C69TH:~/pgap$ ./pgap.py --update
The latest version of PGAP is 2023-10-03.build7061, you have nothing installed locally.
installation directory: /home/sabioinfo/.pgap
Filesystem      1K-blocks     Used Available Use% Mounted on
/dev/sdc       1055762868 30661384 971398012   4% /
Downloading (as needed) Docker image ncbi/pgap:2023-10-03.build7061
Downloading and extracting tarball: https://s3.amazonaws.com/pgap/input-2023-10-03.build7061.tgz
2023-10-03.build7061: Pulling from ncbi/pgap
2d473b07cdd5: Already exists
b52a90f9a9d2: Already exists
40f5ebbf5b49: Already exists
98da8a1cb22f: Already exists
f94a9e1dd98b: Already exists
07a1d0cd53da: Already exists
3460df8d03d7: Already exists
0f33e9a93715: Already exists
b640fa92d912: Already exists
4d10ec096652: Already exists
2b50222ab7bc: Pulling fs layer
85197a636de2: Pulling fs layer
a977499672c4: Pulling fs layer
895822ead3b4: Pulling fs layer
cb62709a50dd: Pulling fs layer
a72005e4c7d8: Pulling fs layer
3055c2ec87e3: Pulling fs layer
6bc0954fb4da: Pulling fs layer
9bab7de95c36: Pulling fs layer
b7f8ccab9440: Pulling fs layer
29b91ff48d19: Pulling fs layer
a01881da90ca: Pulling fs layer
88fc13c9b338: Pulling fs layer
31e23311893b: Pulling fs layer
88fc13c9b338: Pull complete
31e23311893b: Pull complete
2c183caed7c0: Pull complete
d306e4893fc8: Pull complete
58f15f4e1e16: Pull complete
b483c0565f85: Pull complete
e45db5f3afd1: Pull complete
d5c4cc2fbb2d: Pull complete
33cf0f1b3356: Pull complete
6993cb74e94b: Pull complete
e2b9ca408e38: Pull complete
8564960125ac: Pull complete
0c97a3d4ca2d: Pull complete
ad47c460ff02: Pull complete
66c44f46beea: Pull complete
1d68326148ef: Pull complete
9f69ab609e33: Pull complete
ab3f1a0e1a97: Pull complete
f3557917c12c: Pull complete
5512f0d9d46d: Pull complete
4e698afa5583: Pull complete
b3adf85cd10f: Pull complete
fb7703466412: Pull complete
3ec756e456ea: Pull complete
b1112c3b877f: Pull complete
e75d1456f81f: Pull complete
a83809bade57: Pull complete
7a72510f3fa6: Pull complete
104cad1e49f6: Pull complete
3bd4c5292644: Pull complete
d9657e71219e: Pull complete
Digest: sha256:cfb81df863fe5423226b299e42c94b9420959a0cabacfa15a71f01acde8444f9
Status: Downloaded newer image for ncbi/pgap:2023-10-03.build7061
docker.io/ncbi/pgap:2023-10-03.build7061
(base) sabioinfo@LAPTOP-VC2C69TH:~/pgap$

Then I ran docker run ncbi/pgap and an error message like this was thrown:

Unable to find image 'ncbi/pgap:latest' locally
docker: Error response from daemon: manifest for ncbi/pgap:latest not found: manifest unknown: manifest unknown.

How do I solve this issue?
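The error mentions 'ncbi/pgap:latest' even though no tag was typed, because Docker assumes the tag latest when an image reference carries no tag or digest. The sketch below only mimics that naming rule to make it visible; the function name is made up, and it does not call Docker.

```python
def with_default_tag(image: str) -> str:
    """Mimic Docker's rule: a reference without a tag or digest means ':latest'."""
    if "@" in image:
        # A digest reference such as name@sha256:... is already fully pinned.
        return image
    last_component = image.rsplit("/", 1)[-1]
    if ":" in last_component:
        # A tag is present; a colon before the last '/' would be a registry port.
        return image
    return image + ":latest"


print(with_default_tag("ncbi/pgap"))
print(with_default_tag("ncbi/pgap:2023-10-03.build7061"))
```

Going by the error message, the repository has no latest tag, so the reference has to include the full tag shown by docker image ls, here ncbi/pgap:2023-10-03.build7061.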

Sabarish2001 commented 4 months ago

@azat-badretdin, I apologize for the straight lines you may be seeing in the message. I have no idea why this is happening.

azat-badretdin commented 4 months ago

I do not see problems in your installation. Could you please try to run an actual genomic example now from Quick Start?

Sabarish2001 commented 4 months ago

I do not see problems in your installation. Could you please try to run an actual genomic example now from Quick Start?

Hello @azat-badretdin, apologies for the late response.

Sure, I will run an example from the Quick Start and update you.

Sabarish2001 commented 4 months ago

Hi @azat-badretdin, I ran the Mycoplasmoides genitalium example provided with the installation. It ran successfully and the output files were generated. Thank you so much for your assistance.

However, I would ask you not to close this issue for some time, as I haven't tested it on my actual genomic data yet. Could you keep it open until I update you after that test? That way, if any issues arise with my actual genomic data, I can come back here.

azat-badretdin commented 4 months ago

Could you keep it open

Sure

Sabarish2001 commented 4 months ago

Hello @azat-badretdin, I ran the annotation on my own data, the assembly file (assembly.fasta), and it produced the following error.

This is the command I ran:

./pgap.py -r -o ../mscproject/ass_1kb_ann_res/ -g ../mscproject/assembly_1kb/assembly.fasta -s 'Lactobacillus crispatus'

Traceback (most recent call last):
  File "/home/sabioinfo/miniconda3/lib/python3.11/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/home/sabioinfo/miniconda3/lib/python3.11/http/client.py", line 1286, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/sabioinfo/miniconda3/lib/python3.11/http/client.py", line 1332, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/sabioinfo/miniconda3/lib/python3.11/http/client.py", line 1281, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/sabioinfo/miniconda3/lib/python3.11/http/client.py", line 1041, in _send_output
    self.send(msg)
  File "/home/sabioinfo/miniconda3/lib/python3.11/http/client.py", line 979, in send
    self.connect()
  File "/home/sabioinfo/miniconda3/lib/python3.11/http/client.py", line 1451, in connect
    super().connect()
  File "/home/sabioinfo/miniconda3/lib/python3.11/http/client.py", line 945, in connect
    self.sock = self._create_connection(
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/socket.py", line 827, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/socket.py", line 962, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sabioinfo/pgap/./pgap.py", line 994, in main
    params = Setup(args)
             ^^^^^^^^^^^
  File "/home/sabioinfo/pgap/./pgap.py", line 532, in __init__
    self.remote_versions = self.get_remote_versions()
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/pgap/./pgap.py", line 606, in get_remote_versions
    response = urlopen('https://api.github.com/repos/ncbi/pgap/releases/latest')
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/urllib/request.py", line 519, in open
    response = self._open(req, data)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/urllib/request.py", line 496, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sabioinfo/miniconda3/lib/python3.11/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>

May I know what the possible issue could be?

azat-badretdin commented 4 months ago

This looks like an internet connection error. Have you tried rerunning the command at another time?
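The traceback ends in socket.gaierror while looking up api.github.com, which means the host name could not be resolved at that moment. One way to check this before rerunning is to retry the lookup a few times. The sketch below is an illustration only; the function name, the retry counts, and the injectable resolver argument are assumptions, not anything pgap.py provides.

```python
import socket
import time


def wait_for_dns(host: str, attempts: int = 3, delay: float = 5.0,
                 resolver=socket.getaddrinfo) -> bool:
    """Try to resolve host up to `attempts` times; return True on the first success."""
    for attempt in range(1, attempts + 1):
        try:
            resolver(host, 443)
            return True
        except socket.gaierror:
            if attempt < attempts:
                time.sleep(delay)  # give a temporary failure time to clear
    return False


if __name__ == "__main__":
    # Host name taken from the urlopen call in the traceback.
    if wait_for_dns("api.github.com"):
        print("Name resolution works; rerun the pgap.py command.")
    else:
        print("Still cannot resolve api.github.com; check the network or DNS settings.")
```

The resolver parameter exists only so the retry logic can be exercised without a network; in normal use the default socket.getaddrinfo is what gets called.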