ncbi / pgap

NCBI Prokaryotic Genome Annotation Pipeline

Unable to read test genome in input.yml file: undefined reference. #184

Closed emilyjunkins closed 2 years ago

emilyjunkins commented 2 years ago

Hello,

I am running PGAP (2022-02-10.build5872) on Docker (version 20.10.10, build b485636) on macOS Catalina (10.15.7).

I am running the test genome after installation and getting a problem similar to the one described in issue #147. I checked the FASTA format in both test genome directories (test_genomes and test_genomes-2022-02-10.build5872), and both runs returned the same error:

Original command: ./pgap.py --cpus 8 -m 16g -r -v -o mg37_results test_genomes/MG37/input.yaml

Docker command: /usr/local/bin/docker run -i --rm --user 507:20 --volume /Users/ejunkins/pgap/input-2022-02-10.build5872:/pgap/input:ro,z --volume /Users/ejunkins/pgap/test_genomes/MG37:/pgap/user_input:z --volume /Users/ejunkins/pgap/test_genomes/MG37/pgap_input_yw2xm6zp.yaml:/pgap/user_input/pgap_input.yaml:ro,z --volume /var/folders/zv/drcf9r0n4b9c6x_1np1xdvdr0000gv/T/:/tmp:rw,z --volume /Users/ejunkins/pgap/mg37_results:/pgap/output:rw,z --memory 16g ncbi/pgap:2022-02-10.build5872 /bin/taskset -c 0-7 cwltool --timestamps --debug --disable-color --preserve-entire-environment --outdir /pgap/output pgap/pgap.cwl /pgap/user_input/pgap_input.yaml

--- Start YAML Input ---
fasta:
  class: File
  location: ASM2732v1.annotation.nucleotide.1.fasta
submol:
  class: File
  location: pgap_submol_j0zgxy6u.yaml
supplemental_data: { class: Directory, location: /pgap/input }
report_usage: true
--- End YAML Input ---

--- Start Runtime Report ---
{
    "CPU cores": 8,
    "Docker image": "ncbi/pgap:2022-02-10.build5872",
    "cpu flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 bmi2 erms rtm xsaveopt arat",
    "cpu model": "Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz",
    "max user processes": "unlimited",
    "memory (GiB)": 31.4,
    "memory per CPU core (GiB)": 3.9,
    "open files": 1048576,
    "tmp disk space (GiB)": 44.5,
    "virtual memory": "unlimited",
    "work disk space (GiB)": 1393.3
}
--- End Runtime Report ---

[2022-02-18 22:54:29] INFO /pgap/venv/bin//cwltool 3.1.20220210171524
[2022-02-18 22:54:29] INFO Resolved 'pgap/pgap.cwl' to 'file:///pgap/pgap/pgap.cwl'
pgap/pgap.cwl:22:7: Warning: Field `location` contains undefined reference to
                    `file:///pgap/pgap/input`

I have also run this with -d, but there is no output in the debug/log directory.

Could someone help me with this?

Thanks!
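
For anyone repeating the FASTA check mentioned above, here is a minimal sketch of what such a check could look like. It is only a generic syntactic test (header line first, nucleotide letters only), not the validation PGAP itself performs, and the function name `fasta_looks_valid` is hypothetical.

```shell
#!/bin/sh
# Generic FASTA sanity check (an assumption about what "correct format"
# means here; PGAP's own validation may be stricter).
fasta_looks_valid() {
    # $1: path to the FASTA file
    # The first line must be a header starting with '>'.
    head -n 1 "$1" | grep -q '^>' || return 1
    # There must be at least one sequence line containing letters.
    grep -v '^>' "$1" | grep -q '[A-Za-z]' || return 1
    # Sequence lines may contain only IUPAC nucleotide codes and whitespace.
    if grep -v '^>' "$1" | grep -q '[^ACGTUNRYKMSWBDHVacgtunrykmswbdhv[:space:]]'; then
        return 1
    fi
    return 0
}

# Usage, with the file name from the YAML input above:
# fasta_looks_valid test_genomes/MG37/ASM2732v1.annotation.nucleotide.1.fasta && echo "looks OK"
```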

azat-badretdin commented 2 years ago

Thank you for your report!

Do you have anything in the output/debug/tmp* files? Would you mind posting the listing (ls -Raltr output/debug)? Thanks!

azat-badretdin commented 2 years ago

getting a similar problem described in this https://github.com/ncbi/pgap/issues/147.

Would you mind clarifying in what way it is similar? It seems that the cases reported in issue #147 advanced further in execution than yours did.

azat-badretdin commented 2 years ago

undefined reference

This is just a warning, a side effect of how the given input files from the test case are handled. The reason for the failure is elsewhere.
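
Since the warning is not the cause, it can help to search the log for lines that indicate an actual failure. The following is a minimal sketch; the function name `find_failures` is hypothetical, and the three patterns are assumptions about what a failing cwltool run typically logs (an `ERROR` level line, a `permanentFail` status, or a Python `Traceback`), not an exhaustive list.

```shell
#!/bin/sh
# Print numbered log lines that look like real failures rather than warnings.
# The patterns are assumptions, not a complete list of cwltool error messages.
find_failures() {
    # $1: path to the log file, e.g. mg37_results/cwltool.log
    grep -nE 'ERROR|permanentFail|Traceback' "$1"
}

# Usage:
# find_failures mg37_results/cwltool.log || echo "no failure lines found"
```

If this prints nothing, the log ended before any step reported an error, which would suggest the run stopped at or before workflow start-up.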

emilyjunkins commented 2 years ago

Hi!

The debug directory is completely empty. This was run from output/debug:

ls -Raltr 
total 0
drwxr-xr-x  2 ejunkins  staff   64 Feb 22 13:21 log
drwxr-xr-x  3 ejunkins  staff   96 Feb 22 13:21 .
drwxr-xr-x  4 ejunkins  staff  128 Feb 22 13:21 ..

./log:
total 0
drwxr-xr-x  2 ejunkins  staff  64 Feb 22 13:21 .
drwxr-xr-x  3 ejunkins  staff  96 Feb 22 13:21 ..

As for issue #147, it is similar only in that the log file looked the same. I see that it was run on a user file rather than on the test file, as mine was, and the test file has the correct FASTA format.

azat-badretdin commented 2 years ago

Thank you! We will try to reproduce this on our end after finding a suitable host.

emilyjunkins commented 2 years ago

Hi @azat-badretdin! I am following up on this issue while it is on my mind. Please let me know if you need anything from me (input file, machine specs, etc.). Thanks!

azat-badretdin commented 2 years ago

Thanks for offering help! No info is needed at this point; the ball is in our court.

azat-badretdin commented 2 years ago

Have you tried running pgap.py without additional parameters, i.e. exactly as specified in the Quick Start notes?

azat-badretdin commented 2 years ago

Another thing to try is to check permissions. Could you please run this command:


docker run -i --user $(id -u):$(id -g) --volume $HOME/:/test:rw,z ncbi/pgap:2022-02-10.build5872 cp /test/.zshrc /test/i_can_copy

and see whether you have a ~/i_can_copy file afterwards?

azat-badretdin commented 2 years ago

Also, FYI: we set up a test on a Catalina machine and were not able to reproduce the problem.

emilyjunkins commented 2 years ago

Hi @azat-badretdin, thank you for your patience while I got back to this issue.

I ran the command exactly as given in the Quick Start and got the following output in the cwltool.log file:

cat cwltool.log 
Original command: ./pgap.py -r -o mg37_results test_genomes/MG37/input.yaml

Docker command: /usr/local/bin/docker run -i --rm --user 507:20 --volume /Users/ejunkins/pgap/input-2022-02-10.build5872:/pgap/input:ro,z --volume /Users/ejunkins/pgap/test_genomes/MG37:/pgap/user_input:z --volume /Users/ejunkins/pgap/test_genomes/MG37/pgap_input_3x_ia4b5.yaml:/pgap/user_input/pgap_input.yaml:ro,z --volume /var/folders/zv/drcf9r0n4b9c6x_1np1xdvdr0000gv/T/:/tmp:rw,z --volume /Users/ejunkins/pgap/mg37_results:/pgap/output:rw,z ncbi/pgap:2022-02-10.build5872 cwltool --timestamps --debug --disable-color --preserve-entire-environment --outdir /pgap/output pgap/pgap.cwl /pgap/user_input/pgap_input.yaml

--- Start YAML Input ---
fasta:
  class: File
  location: ASM2732v1.annotation.nucleotide.1.fasta
submol:
  class: File
  location: pgap_submol_qs4dpy95.yaml
supplemental_data: { class: Directory, location: /pgap/input }
report_usage: true
--- End YAML Input ---

--- Start Runtime Report ---
{
    "CPU cores": 8,
    "Docker image": "ncbi/pgap:2022-02-10.build5872",
    "cpu flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 bmi2 erms rtm xsaveopt arat",
    "cpu model": "Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz",
    "max user processes": "unlimited",
    "memory (GiB)": 31.4,
    "memory per CPU core (GiB)": 3.9,
    "open files": 1048576,
    "tmp disk space (GiB)": 44.5,
    "virtual memory": "unlimited",
    "work disk space (GiB)": 1384.5
}
--- End Runtime Report ---

[2022-04-13 20:45:53] INFO /pgap/venv/bin/cwltool 3.1.20220210171524
[2022-04-13 20:45:53] INFO Resolved 'pgap/pgap.cwl' to 'file:///pgap/pgap/pgap.cwl'
pgap/pgap.cwl:22:7: Warning: Field `location` contains undefined reference to
                    `file:///pgap/pgap/input`

I then checked my permissions, and they may be the problem:

docker run -i --user $(id -u):$(id -g) --volume $HOME/:/test:rw,z ncbi/pgap:2022-02-10.build5872 cp /test/.zshrc /test/i_can_copy

cp: cannot stat '/test/.zshrc': No such file or directory

azat-badretdin commented 2 years ago

Thanks!

Could you please modify the test to use a file that is known to exist, like this:


touch ~/test.my.docker; docker run -i --user $(id -u):$(id -g) --volume $HOME/:/test:rw,z ncbi/pgap:2022-02-10.build5872 cp /test/test.my.docker /test/i_can_copy
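
The modified test above can also be wrapped in a small script that first confirms the probe file exists on the host, so that a "cannot stat" error inside the container points at the bind mount rather than at a missing file. This is only a sketch: the image tag and file names are taken from the commands in this thread, while the function names `host_probe_ok` and `run_mount_test` are hypothetical.

```shell
#!/bin/sh
# Sketch of the bind-mount test with a host-side check and cleanup.
# IMAGE and the file names come from the thread; function names are made up.
IMAGE="ncbi/pgap:2022-02-10.build5872"

# Succeeds only if the probe file exists and is readable on the host.
host_probe_ok() {
    [ -f "$1" ] && [ -r "$1" ]
}

# Create a probe file in directory $1, ask the container to copy it
# through the mounted volume, report the outcome, then clean up.
run_mount_test() {
    dir="$1"
    probe="$dir/test.my.docker"
    copy="$dir/i_can_copy"
    touch "$probe" || return 2
    host_probe_ok "$probe" || return 2
    docker run -i --rm --user "$(id -u):$(id -g)" \
        --volume "$dir/:/test:rw,z" "$IMAGE" \
        cp /test/test.my.docker /test/i_can_copy
    status=$?
    if [ "$status" -eq 0 ] && [ -f "$copy" ]; then
        echo "mount OK: container read and wrote inside $dir"
        result=0
    else
        echo "mount FAILED: container could not copy inside $dir (exit $status)"
        result=1
    fi
    rm -f "$probe" "$copy"
    return "$result"
}

# Usage:
# run_mount_test "$HOME"
```

A return code of 2 means the problem is on the host side (the probe file could not be created or read), and 1 means the container could not read or write through the mount.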