CanDIG / CanDIGv2

The CanDIG v2 platform
GNU Lesser General Public License v3.0

HTSget integration test failure on stable branches #237

Closed: justin-ys closed this issue 10 months ago

justin-ys commented 1 year ago

Please add the results of the following commands:

git status

On branch develop
Your branch is up to date with 'origin/develop'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
  (commit or discard the untracked or modified content in submodules)
    modified:   lib/candig-data-portal/candig-data-portal (new commits, modified content)
    modified:   lib/federation/federation (modified content, untracked content)
    modified:   lib/htsget/htsget_app (untracked content)
    modified:   lib/katsu/katsu_service (new commits, modified content)
    modified:   lib/opa/opa (new commits)

Untracked files:
  (use "git add <file>..." to include in what will be committed)
    authx.txt
    federation.txt
    htsget.txt
    integration.txt
    katsu.html
    lib/federation-service/
    opa.txt
    rebuild_htsget.sh
    rebuild_katsu.sh
    tyk.txt

no changes added to commit (use "git add" and/or "git commit -a")

git submodule status

+e94f31f3ec3662fe8a1a95f4d46a7a7c5ccbaa3f lib/candig-data-portal/candig-data-portal (v0.1.8-10-ge94f31f)
 205497c40440501f64aafcd9a6ca6f0fd9820e0d lib/federation/federation (v0.5.5-76-g205497c)
 10a3e40019a4d2efd387bb7d7a4fa33b357664a3 lib/htsget/htsget_app (v2.0.1)
+f0688716b0582f1e68eb10ce71529ad3f51577ea lib/katsu/katsu_service (v1.5.1-1217-gf0688716)
+5958ea194e17a301e67ccf49831014ca07b9128d lib/opa/opa (v1.3.2-4-g5958ea1)
 b0ff5be051f2fd55352e00450b7848dcf8354a3b lib/toil/toil-docker (releases/5.5.0)

cat tmp/error.txt

https://gist.github.com/jman005/1c977aad3f6374c0bfb902e5191cd540

docker ps

CONTAINER ID   IMAGE                              COMMAND                  CREATED        STATUS                   PORTS                                                           NAMES
4db5b0c94269   candigv2_vault-runner              "bash /app/entrypoin…"   2 hours ago    Up 2 hours                                                                               candigv2_vault-runner_1
234eb213ca0a   hashicorp/vault:1.13               "docker-entrypoint.s…"   2 hours ago    Up 2 hours (unhealthy)   0.0.0.0:8200->8200/tcp, :::8200->8200/tcp                       candigv2_vault_1
f42a3503ef43   candig/opa-runner:v1.3.2           "bash /app/entrypoin…"   2 hours ago    Up 2 hours                                                                               candigv2_opa-runner_1
566248baa7a6   openpolicyagent/opa:latest         "/opa run --server -…"   2 hours ago    Up 2 hours (unhealthy)   0.0.0.0:8181->8181/tcp, :::8181->8181/tcp                       candigv2_opa_1
3d33b349d57b   candigv2_tyk                       "/opt/tyk-gateway/ty…"   2 hours ago    Up 2 hours (healthy)     0.0.0.0:5080->8080/tcp, :::5080->8080/tcp                       candigv2_tyk_1
d2630fb93171   redis:5.0-alpine                   "docker-entrypoint.s…"   2 hours ago    Up 2 hours (healthy)     0.0.0.0:6379->6379/tcp, :::6379->6379/tcp                       candigv2_tyk-redis_1
5eff7457ab07   candigv2_keycloak                  "/opt/jboss/tools/do…"   2 hours ago    Up 2 hours (healthy)     0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 8443/tcp             candigv2_keycloak_1
927a496899ba   candig/candig-data-portal:v0.1.8   "bash entrypoint.sh"     2 hours ago    Up 2 hours (unhealthy)   0.0.0.0:2543->2543/tcp, :::2543->2543/tcp                       candigv2_candig-data-portal_1
e81af7ccdd29   candig/katsu:v2.0.0                "/app/chord_metadata…"   2 hours ago    Up 2 hours               0.0.0.0:8008->8000/tcp, :::8008->8000/tcp                       candigv2_katsu_1
982555b669b5   postgres:15-alpine                 "docker-entrypoint.s…"   2 hours ago    Up 2 hours               0.0.0.0:5433->5432/tcp, :::5433->5432/tcp                       candigv2_metadata-db_1
457eecca65b9   candig/htsget:v2.0.1               "bash entrypoint.sh"     2 hours ago    Up 2 hours               0.0.0.0:3333->3000/tcp, :::3333->3000/tcp                       candigv2_htsget_1
95706c7a9378   minio/minio:latest                 "/usr/bin/docker-ent…"   2 hours ago    Up 2 hours (healthy)     0.0.0.0:9000-9001->9000-9001/tcp, :::9000-9001->9000-9001/tcp   candigv2_minio_1
4520abf2f284   candig/federation:v1.0.0           "bash entrypoint.sh"     4 hours ago    Up 4 hours               0.0.0.0:4232->4232/tcp, :::4232->4232/tcp                       candigv2_federation_1
2bbc8f32eacf   candig/candig-data-portal:v0.1.8   "bash entrypoint.sh"     26 hours ago   Up 26 hours                                                                              wizardly_bhabha

make test-integration

https://gist.github.com/jman005/b3838ba2785f9805c47697bae1738304

daisieh commented 1 year ago

Your first failure seems to be on the basic tyk test. Does the command always fail?

curl "http://candig.docker.internal:3333/ga4gh/drs/v1/service-info"

curl "http://candig.docker.internal:5080/genomics/ga4gh/drs/v1/service-info" \
     -H 'Authorization: Bearer <site admin token>'

daisieh commented 1 year ago

(also please paste the results from the tyk log of that last command, the curl for 5080)

daisieh commented 1 year ago

To streamline this sort of debugging, I added some print statements. Can you pull the branch from https://github.com/CanDIG/CanDIGv2/pull/238 on top of your stack and then give me the results from make test-integration?

justin-ys commented 1 year ago

The two Tyk commands are working, and the Tyk-related tests now seem to pass after a rebuild and a short wait. Here's the result of test-integration using the new branch: https://gist.github.com/jman005/b3838ba2785f9805c47697bae1738304

daisieh commented 1 year ago

What are the changes in your opa submodule? Those are the next tests that are failing, and I can't tell what changes you've made.

daisieh commented 1 year ago

Also your vault seems to have sealed itself; you will have to unseal it. You should be able to do so by restarting the vault-runner container.

daisieh commented 1 year ago

If that doesn't fix it, I would need to see what is in your vault log and how it's failing to unseal.
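
A quick way to spot this kind of problem is to list the containers that Docker reports as unhealthy; the docker ps output in the original report marks vault, opa, and candig-data-portal that way. The following is a minimal Python sketch and not part of CanDIGv2: the function names are hypothetical, and it assumes tab-separated output from docker ps --format "{{.Names}}<TAB>{{.Status}}". An unhealthy status does not by itself prove that the vault is sealed, so the vault log is still the place to confirm.

```python
import subprocess


def unhealthy_containers(ps_output: str) -> list[str]:
    """Return names of containers whose status contains '(unhealthy)'.

    Expects one container per line: name, a tab, then the status text.
    """
    names = []
    for line in ps_output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        if "(unhealthy)" in status:
            names.append(name)
    return names


def current_unhealthy() -> list[str]:
    """Ask the local Docker CLI for container names and statuses."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return unhealthy_containers(result.stdout)


# Sample rows abbreviated from the docker ps listing in the report.
sample = (
    "candigv2_vault_1\tUp 2 hours (unhealthy)\n"
    "candigv2_tyk_1\tUp 2 hours (healthy)\n"
    "candigv2_opa_1\tUp 2 hours (unhealthy)\n"
    "candigv2_katsu_1\tUp 2 hours\n"
)
print(unhealthy_containers(sample))
```

Calling current_unhealthy() needs a working Docker CLI; unhealthy_containers() also works on saved output.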

justin-ys commented 1 year ago

Rebuilding once again seems to have fixed the vault seal. There are no changes to opa; I'm really not sure why git status is showing that. git branch in the opa directory shows that it is on main, and git stash shows no local changes. Here is the most recent log: https://gist.github.com/jman005/b3838ba2785f9805c47697bae1738304 (I'm just reusing the same link and editing the gist, by the way.)

daisieh commented 1 year ago

I think that maybe the paths in Opa are out of date and aren't matching what we expect in the test?

What are the values you get when you do:

## opa paths
curl -X "POST" "http://candig.docker.internal:5080/policy/v1/data/paths" \
     -H 'Authorization: Bearer <site admin token>' \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{}'

justin-ys commented 1 year ago

My output is {"result":{"get":["/v2/discovery/?.*","/v2/authorized/?.*","/htsget/v1/variants/?.*","/htsget/v1/variants/search","/htsget/v1/variants/?.*/index","/htsget/v1/reads/?.*","/ga4gh/drs/v1/objects/?.*","/ga4gh/drs/v1/datasets/?.*","/beacon/v2/g_variants/?.*"],"post":["/htsget/v1/variants/search","/htsget/v1/variants/?.*/index","/beacon/v2/g_variants/?.*"]},"warning":{"code":"api_usage_warning","message":"'input' key missing from the request"}}
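
For reference, here is a minimal Python sketch for checking a request path against a result like the one above. It assumes each entry is a regular expression that must match the whole request path; that is an assumption about how the policy uses these patterns, not something confirmed in this thread. The function name is_listed is hypothetical, and the pattern lists are abbreviated from the output.

```python
import re

# Abbreviated from the "result" object in the output above.
paths = {
    "get": [
        "/v2/discovery/?.*",
        "/v2/authorized/?.*",
        "/htsget/v1/variants/?.*",
        "/htsget/v1/reads/?.*",
        "/ga4gh/drs/v1/objects/?.*",
    ],
    "post": [
        "/htsget/v1/variants/search",
        "/htsget/v1/variants/?.*/index",
        "/beacon/v2/g_variants/?.*",
    ],
}


def is_listed(paths: dict, method: str, path: str) -> bool:
    """Return True if `path` fully matches any pattern listed for `method`."""
    patterns = paths.get(method.lower(), [])
    return any(re.fullmatch(pattern, path) for pattern in patterns)


print(is_listed(paths, "GET", "/v2/discovery/"))
print(is_listed(paths, "POST", "/v2/discovery/"))
```

Under that assumption, a GET on /v2/discovery/ is covered by these patterns while a POST on the same path is not.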

daisieh commented 1 year ago

What does it say in line 83 of your test_integration.py? Mine says

    payload = {"input": {"body": {"path": "/v2/discovery/", "method": "GET"}}}

justin-ys commented 1 year ago

Mine is response = requests.post(f"{ENV['CANDIG_ENV']['OPA_URL']}/v1/data/permissions/datasets", json=payload, headers=headers)

daisieh commented 1 year ago

That looks like my line 93, not 83. There should be something about a payload that looks like what I posted.

justin-ys commented 1 year ago

The problem turned out to be a merge conflict. New log after the fix is at the same gist link.

daisieh commented 1 year ago

UGH now I really don't know. HTSGet is acting like it thinks user2 is not a site admin, but the test_site_admin test is before that and passes just fine...

daisieh commented 1 year ago

Following up...this is working for you now, correct?

justin-ys commented 1 year ago

Only on my branch with the new auth model; otherwise they still fail.

daisieh commented 1 year ago

And the failure is that htsget can't authorize site admins, but the test_site_admin test in test_integration works fine?

justin-ys commented 1 year ago

Yes, that seems to be the case. (The "add sample to dataset" test also fails.) Notably, the site admin test fails with a 404, so the problem seems to be not admin authentication but a failure to reach a certain htsget endpoint (the logs seem to indicate /genomics/htsget/v1/v1/variants/data/{obj}).

daisieh commented 1 year ago

There seems to be a typo in the print statement on line 299 of test_integration.py (but not in the actual assertion): it should be print(f"{ENV['CANDIG_URL']}/genomics/htsget/v1/variants/data/{obj}"). That doesn't matter, though, since the test isn't actually using that URL.
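
A doubled segment like /v1/v1 is easy to miss in logs. As a minimal sketch, a small Python helper can flag consecutive repeated path segments in a URL. The helper name repeated_segments is hypothetical, and the host and object name in the example are placeholders, not the actual value of ENV['CANDIG_URL'].

```python
from urllib.parse import urlsplit


def repeated_segments(url: str) -> list[str]:
    """Return path segments that appear twice in a row, e.g. 'v1' in /v1/v1/."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return [a for a, b in zip(segments, segments[1:]) if a == b]


# Placeholder base URL and object name, for illustration only.
logged = "http://candig.docker.internal:5080/genomics/htsget/v1/v1/variants/data/obj"
fixed = "http://candig.docker.internal:5080/genomics/htsget/v1/variants/data/obj"

print(repeated_segments(logged))
print(repeated_segments(fixed))
```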

daisieh commented 1 year ago

One last thing: I am noticing that the failure you have in htsget's tests happens on line 115 of the test code. That's not the current version of htsget. You also seem to be building on the old version. Can you make sure that your submodule of htsget is pulled to v2.1.0, sha c92359e? That is the version that is in develop of CanDIGv2.

=================================== FAILURES ===================================
______________________________ test_post_objects _______________________________

drs_objects = [{'access_methods': [{'access_id': 'candig_docker_internal_9000/testhtsget/NA18537.vcf.gz.tbi', 'type': 's3'}], 'alias...gz.tbi'], 'id': 'index', 'name': 'multisample_1.vcf.gz.tbi'}], 'created_time': '2021-09-27T18:40:00.538843', ...}, ...]

>   ???
E   assert 403 == 200
E    +  where 403 = <Response [403]>.status_code

/home/justin/Programs/CanDIGv2/lib/htsget/htsget_app/tests/test_htsget_server.py:115: AssertionError
----------------------------- Captured stdout call -----------------------------
POST NA18537.vcf.gz.tbi: {
  "message": "User is not authorized to POST"
}

justin-ys commented 1 year ago

Git insists the submodule is on exactly that version/commit, although when I ran git stash, it did pick up some changes for some reason. However, even after rebuilding without them, I get the same errors.
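
One way to double-check a submodule pin is to parse the output of git submodule status and compare the sha against the expected prefix. The Python sketch below is illustrative only; the function names are hypothetical, and the sample lines are copied from the status pasted in the original report, which may predate later pulls. In that output, a leading + means the checked-out commit differs from the one recorded in the superproject, while a blank flag only says the checkout matches what the local superproject records, so comparing against the expected sha is a separate check.

```python
def parse_submodule_status(output: str) -> dict[str, dict[str, str]]:
    """Map each submodule path to its flag, sha, and describe string."""
    entries = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # First character is ' ', '+', '-', or 'U'; treat anything else as ' '.
        flag = line[0] if line[0] in "+-U" else " "
        fields = line.lstrip("+-U ").split(maxsplit=2)
        sha, path = fields[0], fields[1]
        describe = fields[2].strip("()") if len(fields) > 2 else ""
        entries[path] = {"flag": flag, "sha": sha, "describe": describe}
    return entries


def is_pinned(entries: dict, path: str, expected_sha_prefix: str) -> bool:
    """True if the submodule is in sync and its sha starts with the prefix."""
    entry = entries[path]
    return entry["flag"] == " " and entry["sha"].startswith(expected_sha_prefix)


# Two lines copied from the git submodule status output in the report.
status = """\
+e94f31f3ec3662fe8a1a95f4d46a7a7c5ccbaa3f lib/candig-data-portal/candig-data-portal (v0.1.8-10-ge94f31f)
 10a3e40019a4d2efd387bb7d7a4fa33b357664a3 lib/htsget/htsget_app (v2.0.1)
"""

entries = parse_submodule_status(status)
print(entries["lib/htsget/htsget_app"]["describe"])
print(is_pinned(entries, "lib/htsget/htsget_app", "c92359e"))
```

Against the status from the original report, the htsget submodule shows v2.0.1 rather than the requested c92359e, so rerunning git submodule status and feeding in the fresh output would show whether that has since changed.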