Closed wdlingit closed 3 years ago
I am fine with manually submitting the files to Argot2.5 and putting the result files in the appropriate place for the final step of GOMAP. Please let me know how to do that, if it is possible.
Hey @wdlingit,
It should be fine doing the Argot2 step manually, but the server sometimes is unavailable. It might be worth trying again before running things manually.
I will try to update the documentation with the manual steps if Argot2.5 keeps failing.
Thanks
Best Kokul
Using my own data and also running test.sh, I have tried about 5 times (including once on a fresh ubuntu16 VM). All runs failed at the same step. However, Argot2.5 usually seems to be available from my desktop browser. Is it possible that the Argot2.5 server made some changes that cause this?
I am running the test again now. It worked the day before yesterday. I will update once it has completed. If it fails, then I will update with manual running instructions.
Please go to the test directory and run git checkout test, update the test/config.yaml file, and re-run test.sh. I just ran the test and it went through with no issues on my machine. It's a Windows machine with WSL2. I have also tried from the cluster and it works well there too.
The manual upload of Argot2.5 would be as follows for the test.
The files are located at test/GOMAP-0.3_GOMAP-input/tmp/mixed-meth/argot2.5
The blast directory contains
blast/0.3_GOMAP-input.1.tsv.zip
blast/0.3_GOMAP-input.2.tsv.zip
The hmmer directory contains
hmmer/0.3_GOMAP-input.1.hmm.out.zip
hmmer/0.3_GOMAP-input.2.hmm.out.zip
Upload the zip files to the Argot2 web server in pairs (each blast zip together with the hmmer zip of the same number) and download the results. The downloaded zip files should be saved as follows.
results/0.3_GOMAP-input.1.tsv.zip
results/0.3_GOMAP-input.2.tsv.zip
As long as these files are there GOMAP should complete without issue.
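For anyone scripting the manual route, the pairing described above could be sketched roughly as follows. This is only an illustration and not part of GOMAP: the helper names pair_inputs and missing_results are made up here, and the only things taken from this thread are the blast/, hmmer/ and results/ directory names and the file-name patterns.

```python
import os
import tempfile


def pair_inputs(argot_dir):
    """Pair each blast zip with the hmmer zip that shares its prefix.

    Assumes the layout listed above: blast/<name>.tsv.zip and
    hmmer/<name>.hmm.out.zip inside the argot2.5 directory.
    """
    pairs = []
    for blast_zip in sorted(os.listdir(os.path.join(argot_dir, "blast"))):
        if not blast_zip.endswith(".tsv.zip"):
            continue
        prefix = blast_zip[: -len(".tsv.zip")]
        hmmer_zip = prefix + ".hmm.out.zip"
        if os.path.exists(os.path.join(argot_dir, "hmmer", hmmer_zip)):
            pairs.append((blast_zip, hmmer_zip))
    return pairs


def missing_results(argot_dir):
    """List the result zips that still have to be placed in results/."""
    missing = []
    for blast_zip, _hmmer_zip in pair_inputs(argot_dir):
        # The result is expected under the same name as the blast input.
        if not os.path.exists(os.path.join(argot_dir, "results", blast_zip)):
            missing.append(os.path.join("results", blast_zip))
    return missing


# Self-contained demonstration on a throwaway directory.
demo = tempfile.mkdtemp()
for sub in ("blast", "hmmer", "results"):
    os.makedirs(os.path.join(demo, sub))
for i in (1, 2):
    open(os.path.join(demo, "blast", "0.3_GOMAP-input.%d.tsv.zip" % i), "w").close()
    open(os.path.join(demo, "hmmer", "0.3_GOMAP-input.%d.hmm.out.zip" % i), "w").close()
# Pretend only the first result has been downloaded so far.
open(os.path.join(demo, "results", "0.3_GOMAP-input.1.tsv.zip"), "w").close()

print(pair_inputs(demo))
print(missing_results(demo))
```

Running missing_results on the real argot2.5 directory before the aggregate step would show which downloads are still outstanding.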
Thank you for the instructions. Sorry for the delay, I was busy with other work. I retried test.sh in a new ubuntu18 VM with singularity 3.5.2: git checkout v1.3.5, ./setup.sh, edited test/config.yml to use my email, and ran ./test.sh.
This time I got the same error message:
Completed Running mixmeth-preproc step
Running mixed-method based annotations
Submitting 0.3_GOMAP-input.2.tsv.zip and 0.3_GOMAP-input.2.hmm.out.zip to Argot2.5
Submitting 0.3_GOMAP-input.1.tsv.zip and 0.3_GOMAP-input.1.hmm.out.zip to Argot2.5
Traceback (most recent call last):
File "./gomap.py", line 93, in <module>
run_mixmeth(config)
File "/opt/GOMAP/code/gomap_mixmeth.py", line 31, in run_mixmeth
submit_argot2(config)
File "/opt/GOMAP/code/pipeline/run_argot2.py", line 176, in submit_argot2
r_insert = s.post(argot_url,data=payload,files=files,headers=headers)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 559, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 512, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 622, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 495, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))
BUT I got a "Job submitted" email from the argot2.5 server and then a "Job COMPLETED!" email a few minutes later. So I ran "./test.sh aggregate" for the final test step and it seemed OK. To be safe, I copied test/0.3_GOMAP-input.fa to the GOMAP directory for another test, treating the test fasta as a regular input.
ubuntu@singularity:~/GOMAP1$ cp test/0.3_GOMAP-input.fa .
ubuntu@singularity:~/GOMAP1$ cat min-config.yml
#Input section
input:
#input fasta file name
fasta: 0.3_GOMAP-input.fa
# output file basename
basename: test2
#input NCBI taxonomy id
taxon: "4577"
# Name of the species
species: "Zea mays"
# Email is mandatory
email: my@email.addr
#Number of CPUs used for tools
cpus: 4
#Whether openmpi should be used
mpi: False
#what the name of the temporary directory is
tmpdir: "/tmpdir"
(this perl one-liner generates and runs the commands for the GOMAP steps: seqsim domain fanngo mixmeth-blast mixmeth-preproc mixmeth)
ubuntu@singularity:~/GOMAP1$ echo "seqsim domain fanngo mixmeth-blast mixmeth-preproc mixmeth" | perl -ne 'if($.==1){ $msg=`ls min-config.yml`; chomp $msg; @files=split(/\s+/,$msg) } chomp; @t=split; for $x (@t){ for $f (@files){ $f=~/_(\d+)\./; $cmd="./run-GOMAP-SINGLE.sh --step=$x --config=$f"; print "\nCMD: $cmd\n"; system $cmd } }'
One day later, I got the same error message and NO notifications from argot2.5 server.
CMD: ./run-GOMAP-SINGLE.sh --step=seqsim --config=min-config.yml
(seems OK, log omitted)
CMD: ./run-GOMAP-SINGLE.sh --step=domain --config=min-config.yml
(seems OK, log omitted)
CMD: ./run-GOMAP-SINGLE.sh --step=fanngo --config=min-config.yml
(seems OK, log omitted)
CMD: ./run-GOMAP-SINGLE.sh --step=mixmeth-blast --config=min-config.yml
(seems OK, log omitted)
CMD: ./run-GOMAP-SINGLE.sh --step=mixmeth-preproc --config=min-config.yml
(seems OK, log omitted)
CMD: ./run-GOMAP-SINGLE.sh --step=mixmeth --config=min-config.yml
/tmp:/tmp,/home/ubuntu/GOMAP1:/workdir,/home/ubuntu/GOMAP1/tmp:/tmpdir,
Running GOMAP --step=mixmeth --config=min-config.yml
Running mixed-method based annotations
Submitting test2.1.tsv.zip and test2.1.hmm.out.zip to Argot2.5
Traceback (most recent call last):
File "./gomap.py", line 93, in <module>
run_mixmeth(config)
File "/opt/GOMAP/code/gomap_mixmeth.py", line 31, in run_mixmeth
submit_argot2(config)
File "/opt/GOMAP/code/pipeline/run_argot2.py", line 176, in submit_argot2
r_insert = s.post(argot_url,data=payload,files=files,headers=headers)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 559, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 512, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 622, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 495, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))
So I manually uploaded test2.1.tsv.zip and test2.1.hmm.out.zip to argot2.5 and downloaded the result file "result.zip". I placed result.zip in the argot2.5/results directory and renamed it to test2.1.tsv.zip. The aggregate step then gave an error message:
ubuntu@singularity:~/GOMAP1$ ./run-GOMAP-SINGLE.sh --step=aggregate --config=min-config.yml
/tmp:/tmp,/home/ubuntu/GOMAP1:/workdir,/home/ubuntu/GOMAP1/tmp:/tmpdir,
Running GOMAP --step=aggregate --config=min-config.yml
Running Aggregate Step
[1] "Reading the input file"
[1] "Converting to GAF 2.0"
[1] "Checking if data/data/go/go.obo.data exists"
[1] "data/data/go/go.obo.data exists so loading R object"
user system elapsed
3.903 0.386 4.300
[1] "Writing the outfile"
[1] "Reading the input file"
[1] "Converting to GAF 2.0"
Error in .(QueryId, GO_class, Score) : could not find function "."
Calls: pannzer2gaf
Execution halted
Traceback (most recent call last):
File "./gomap.py", line 103, in <module>
aggregate(config)
File "/opt/GOMAP/code/gomap_aggregate.py", line 29, in aggregate
mixed2gaf(config)
File "/opt/GOMAP/code/pipeline/mixed2gaf.py", line 8, in mixed2gaf
check_output_and_run("test.pod",command)
File "/opt/GOMAP/code/utils/basic_utils.py", line 24, in check_output_and_run
subprocess.check_call(command,stdin=stdin_file,stdout=stdout_file)
File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['Rscript', 'code/pipeline/mixed2gaf.r', '/workdir/GOMAP-test2/test2.all.yml']' returned non-zero exit status 1
ubuntu@singularity:~/GOMAP1$ cat GOMAP-test2/logs/test2-aggregate.log
INFO [2021-10-17 04:24] Starting to run the pipline for test2
INFO [2021-10-17 04:24] Obtaining and aggregating Argot2.5 results
INFO [2021-10-17 04:24] Unzipping test2.1.tsv.zip
INFO [2021-10-17 04:24] Filtering mixed-method GAF
INFO [2021-10-17 04:24] test.pod not present so running command
Rscript code/pipeline/mixed2gaf.r /workdir/GOMAP-test2/test2.all.yml
I compared the file lists of the two tests. test1: made by test.sh; test2: made with the test fasta as a regular input.
(file list diff finding)
ubuntu@singularity:~/GOMAP1$ find GOMAP-test2/ | sort | perl -ne 'chomp; /.+?\/(.+)$/; print "$1\n"' > test2.filelist.1
ubuntu@singularity:~/GOMAP1$ find test/GOMAP-0.3_GOMAP-input/ | sort | perl -ne 'chomp; /.+?\/.+?\/(.+)$/; print "$1\n"' > test1.filelist.1
test1 has files in gaf/a.mm_gaf/ gaf/b.raw_gaf/ gaf/c.uniq_gaf/ gaf/d.non_red_gaf/ gaf/e.agg_data/, while test2 has nothing in gaf/c.uniq_gaf/ gaf/d.non_red_gaf/ gaf/e.agg_data/. Also, there are no files in tmp/mixed-meth/pannzer/results/ for either test1 or test2. I checked this because the last message before the run halted is "Calls: pannzer2gaf". Could this be related? Or did I just do some step wrong? Thank you for your patience.
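The same kind of comparison can also be done per directory rather than per file name, which helps here because the two runs use different basenames. Below is a minimal Python sketch of that idea; the function names files_per_dir and populated_only_in_first are hypothetical, and the demo trees are made up purely to show the behaviour.

```python
import os
import tempfile


def files_per_dir(root):
    """Map each directory under root (relative path) to its file count."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root).replace(os.sep, "/")
        counts[rel] = len(filenames)
    return counts


def populated_only_in_first(root_a, root_b):
    """Directories that hold files under root_a but are empty or absent under root_b."""
    a = files_per_dir(root_a)
    b = files_per_dir(root_b)
    return sorted(d for d, n in a.items() if n > 0 and b.get(d, 0) == 0)


# Made-up demo trees: run_a finished, run_b stopped before c.uniq_gaf was filled.
run_a = tempfile.mkdtemp()
run_b = tempfile.mkdtemp()
for root in (run_a, run_b):
    os.makedirs(os.path.join(root, "gaf", "b.raw_gaf"))
    os.makedirs(os.path.join(root, "gaf", "c.uniq_gaf"))
open(os.path.join(run_a, "gaf", "b.raw_gaf", "a.gaf"), "w").close()
open(os.path.join(run_a, "gaf", "c.uniq_gaf", "a.gaf"), "w").close()
open(os.path.join(run_b, "gaf", "b.raw_gaf", "b.gaf"), "w").close()

print(populated_only_in_first(run_a, run_b))
```

Pointing the two arguments at test/GOMAP-0.3_GOMAP-input and GOMAP-test2 would list the directories where the second run produced nothing.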
Hey @wdlingit,
This is puzzling to me. The argot2.5 error shown below indicates that the Argot server is resetting the connection for some reason. You have worked around it with the manual upload, and that should take care of that issue.
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))
It seems like there is a connection issue between Singularity and Argot2.5, and I am not sure why.
The log file at GOMAP-singularity/test/GOMAP-0.3_GOMAP-input/logs/0.3_GOMAP-input-mixmeth.log should say the following.
INFO [2021-10-17 13:02] Starting to run the pipline for 0.3_GOMAP-input
INFO [2021-10-17 13:02] Running mixed-method based annotations
INFO [2021-10-17 13:02] Submitting the batch inputs to Argot2
INFO [2021-10-17 13:02] Submitting 0.3_GOMAP-input.2.tsv.zip and 0.3_GOMAP-input.2.hmm.out.zip to Argot2.5
DEBUG [2021-10-17 13:02] Starting new HTTP connection (1): www.medcomp.medicina.unipd.it:80
DEBUG [2021-10-17 13:02] http://www.medcomp.medicina.unipd.it:80 "POST /Argot2-5/form_batch.php HTTP/1.1" 200 5453
DEBUG [2021-10-17 13:02] http://www.medcomp.medicina.unipd.it:80 "POST /Argot2-5/insert_batch.php HTTP/1.1" 200 1432
INFO [2021-10-17 13:02] Submitting 0.3_GOMAP-input.1.tsv.zip and 0.3_GOMAP-input.1.hmm.out.zip to Argot2.5
DEBUG [2021-10-17 13:02] Starting new HTTP connection (1): www.medcomp.medicina.unipd.it:80
DEBUG [2021-10-17 13:02] http://www.medcomp.medicina.unipd.it:80 "POST /Argot2-5/form_batch.php HTTP/1.1" 200 5453
DEBUG [2021-10-17 13:02] http://www.medcomp.medicina.unipd.it:80 "POST /Argot2-5/insert_batch.php HTTP/1.1" 200 1439
INFO [2021-10-17 13:02] Running Pannzer
INFO [2021-10-17 13:02] /workdir/test/GOMAP-0.3_GOMAP-input/tmp/mixed-meth/pannzer/results/0.3_GOMAP-input.2_results.GO not present so running command
python run.py /workdir/test/GOMAP-0.3_GOMAP-input/tmp/mixed-meth/pannzer/conf/0.3_GOMAP-input.2.conf
INFO [2021-10-17 13:03] Step completed
INFO [2021-10-17 13:03] /workdir/test/GOMAP-0.3_GOMAP-input/tmp/mixed-meth/pannzer/results/0.3_GOMAP-input.1_results.GO not present so running command
python run.py /workdir/test/GOMAP-0.3_GOMAP-input/tmp/mixed-meth/pannzer/conf/0.3_GOMAP-input.1.conf
INFO [2021-10-17 13:03] Step completed
It seems odd that PANNZER is not producing any output. The test output in the pannzer directory should look like what we see below.
cd GOMAP-singularity/test/GOMAP-0.3_GOMAP-input/tmp/mixed-meth/pannzer && find -type f
./results/0.3_GOMAP-input.2.clusters
./results/0.3_GOMAP-input.2_results.DE
./results/0.3_GOMAP-input.1_results.DE
./results/0.3_GOMAP-input.1.clusters
./results/0.3_GOMAP-input.2_results.GO
./results/0.3_GOMAP-input.1_results.GO
./conf/0.3_GOMAP-input.2.conf
./conf/0.3_GOMAP-input.1.conf
./blast/0.3_GOMAP-input.2.xml
./blast/0.3_GOMAP-input.1.xml
Do you want to set up a time to talk on a call to figure this out?
Please contact me at kokul@bioinformapping.com if that would work.
Best Kokul
I got the same error. But when I did a test run with the source code downloaded from the GOMAP repo, I found that those requests are sent successfully after removing the content type of the file objects, which is specified as 'text/plain' (at lines 145 and 146 of run_argot2.py).
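To illustrate the change described above: the requests library accepts each entry of its files argument as either a (filename, fileobj) tuple or a (filename, fileobj, content_type) tuple. The sketch below only builds the two variants so they can be compared; it does not contact the server, and it is not the actual code from run_argot2.py. The form field names blast_file and hmmer_file and the helper build_files are assumptions made for this example.

```python
import io


def build_files(blast_name, blast_fh, hmmer_name, hmmer_fh, content_type=None):
    """Build a mapping in the shape that requests accepts for `files=`.

    Each value is (filename, fileobj) or (filename, fileobj, content_type).
    The field names "blast_file" and "hmmer_file" are hypothetical.
    """
    def part(name, fh):
        if content_type is None:
            # No explicit content type is attached to the uploaded part.
            return (name, fh)
        return (name, fh, content_type)

    return {
        "blast_file": part(blast_name, blast_fh),
        "hmmer_file": part(hmmer_name, hmmer_fh),
    }


# Variant with an explicit 'text/plain' type, as reported for the failing code.
with_type = build_files(
    "test2.1.tsv.zip", io.BytesIO(b"PK"),
    "test2.1.hmm.out.zip", io.BytesIO(b"PK"),
    content_type="text/plain",
)

# Variant without a content type, matching the change that worked here.
without_type = build_files(
    "test2.1.tsv.zip", io.BytesIO(b"PK"),
    "test2.1.hmm.out.zip", io.BytesIO(b"PK"),
)

print(len(with_type["blast_file"]), len(without_type["blast_file"]))
```

Either mapping could then be passed as files=... to a session's post call like the one in the traceback; why the server rejects one variant only some of the time is not established in this thread.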
Just to make things clear: yxl8241 should be my colleague. I am testing what yxl8241 suggested with a modified GOMAP sif.
I just tested the modified GOMAP sif (with the changes suggested by yxl8241) twice: (i) test3: running test.sh and (ii) test4: running the GOMAP steps with the test fasta as a regular input. Both were successful in all steps. There are some minor differences between the final outputs (aggregate.gaf), but I think they might come from running blast with or without split query files (which usually gives different e-values).
Back to test1, for which I got one argot2.5 job submission email and one completion email, and no result files in the pannzer folders. This is now explainable (my thanks to you and yxl8241): with test.sh, the input fasta was split into two files for processing; submission of the first one was OK and the second one failed. The mixmeth step simply stops when an argot2.5 submission fails, so the pannzer part was never invoked. Accordingly, for test3, I did receive two job submission emails and two job completion emails, and also got result files in the pannzer folders.
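The behaviour described above can be reproduced with a toy model. This is a sketch under assumptions, not the real gomap_mixmeth.py: the traceback only shows that run_mixmeth calls submit_argot2, so the function run_mixmeth_sketch, the batch names, and the stand-in callbacks are all invented for illustration.

```python
def run_mixmeth_sketch(batches, submit, run_pannzer):
    """Submit every batch in order, then run the follow-up stage.

    An exception raised by submit() propagates immediately, so later
    batches and run_pannzer() are never reached.
    """
    for batch in batches:
        submit(batch)
    run_pannzer()


events = []


def flaky_submit(batch):
    # Hypothetical stand-in: the first submission succeeds, the next one
    # fails the way the reset connection did.
    if any(e.startswith("submitted") for e in events):
        raise IOError("Connection reset by peer")
    events.append("submitted " + batch)


try:
    run_mixmeth_sketch(
        ["input.2", "input.1"],
        flaky_submit,
        lambda: events.append("pannzer"),
    )
except IOError as err:
    events.append("aborted: %s" % err)

print(events)
```

One successful submission followed by an abort matches the single pair of emails and the empty pannzer results folder seen for test1.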
It seems puzzling to me that the same code can sometimes successfully submit files to the argot2.5 server (with xen VM, WSL VM, and real server). It also seems to me that the changes suggested by yxl8241 just work.
Hey @yxl8241 and @wdlingit,
I just made a PR in the source code repository. I hope I did it right, though.
Hey @yxl8241 ,
Could you please send the PR to the dev branch? I usually test it in the dev branch and then merge it to master. Unfortunately GH doesn't let me change the branch after you initiated the PR.
Best Kokul
I just made another attempt. Please let me know if it doesn't work.
Thanks @yxl8241, that works.
Describe the bug
Tried the test step as described in https://bioinformapping.com/gomap/master/RUNNING.html and got an error message in the mixmeth step:
requests.exceptions.ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))
Input File
The FASTA in the downloaded test folder
GOMAP step that crashed (if applicable)
mixmeth
Attach the output files (following https://bioinformapping.com/gomap/v1.3.5/RUNNING.html)
Intermediate output file (if applicable)
FILE: logs/0.3_GOMAP-input-mixmeth.log
INFO [2021-09-25 15:24] Starting to run the pipline for 0.3_GOMAP-input
INFO [2021-09-25 15:24] Running mixed-method based annotations
INFO [2021-09-25 15:24] Submitting the batch inputs to Argot2
INFO [2021-09-25 15:24] Submitting 0.3_GOMAP-input.1.tsv.zip and 0.3_GOMAP-input.1.hmm.out.zip to Argot2.5
DEBUG [2021-09-25 15:24] Starting new HTTP connection (1): www.medcomp.medicina.unipd.it:80
DEBUG [2021-09-25 15:24] http://www.medcomp.medicina.unipd.it:80 "POST /Argot2-5/form_batch.php HTTP/1.1" 200 5453
System Details
Additional context
Actually, I tried my own dataset first and got the same error message at the same mixmeth step. So I tried the test case.