biothings / biothings_explorer

TRAPI service for BioThings Explorer
https://explorer.biothings.io
Apache License 2.0

CTD processing 2: batch-queries #584

Closed colleenXu closed 8 months ago

colleenXu commented 1 year ago

Intro: see intro section of https://github.com/biothings/BioThings_Explorer_TRAPI/issues/583#issue-1622873383. Originally noted in https://github.com/biothings/BioThings_Explorer_TRAPI/issues/558#issuecomment-1459097534

2. processing batch-queries correctly

The current x-bte-kgs-operations aren't written as batch-queries, even though the CTD API does allow batch-querying.

The problem is how BTE handles the batch-query responses. The API response is an array of associations (objects) - and each association matched to one of the input IDs. Each association has an "Input" field where the value is the matched input ID (all lowercase, has an ID-prefix for diseases (MESH or OMIM) and pathways (REACT or KEGG)).

However, BTE's default api-response-transform isn't correctly handling this - instead, it's linking the first input ID to every possible output ID.

Example:

Edit SmartAPI and run BTE locally

In a local copy of the [SmartAPI yaml](https://github.com/NCATS-Tangerine/translator-api-registry/blob/master/CTD/smartapi.yaml), copy-paste the following into the `chemical2gene` operation. It changes the `supportBatch` and `queryInputs` info.

```
- supportBatch: true
  useTemplating: true
  inputs:
    - id: MESH
      semantic: SmallMolecule
  outputs:
    - id: NCBIGene
      semantic: Gene
  parameters:
    inputType: chem
    inputTerms: "{{ queryInputs | joinSafe('|') }}"
    inputTermSearchType: directAssociations
    report: genes_curated
    format: json
  predicate: related_to
  response_mapping:
    "$ref": "#/components/x-bte-response-mapping/chemical2gene"
```

Set up a local instance of BTE to override and use your local copy of the CTD yaml. Then POST to that specific API (`v1/smartapi/{id}/query` endpoint):

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["MESH:C006303", "MESH:D015250"],
                    "categories": ["biolink:SmallMolecule"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                }
            }
        }
    }
}
```
CTD's raw response

During execution, BTE should generate [this query with two input IDs](http://ctdbase.org/tools/batchQuery.go?inputType=chem&inputTerms=C006303|D015250&inputTermSearchType=directAssociations&report=genes_curated&format=json) to CTD. In CTD's raw response, some genes are only linked to the second ID D015250 / Aclarubicin, like PARP1.

```
{
    "CasRN": "57576-44-0",
    "ChemicalId": "D015250",
    "ChemicalName": "Aclarubicin",
    "GeneId": "142",
    "GeneSymbol": "PARP1",
    "Input": "d015250",
    "Organism": "Homo sapiens",
    "OrganismId": "9606",
    "PubMedIds": "20399885"
},
```
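To make the shape of that two-ID batch request concrete, here is a minimal Python sketch that assembles the same GET url from its parameters. The function name `build_ctd_batch_url` is made up for illustration, and this is not how BTE builds the request internally (BTE uses the templated x-bte annotation); it only shows that the input IDs end up pipe-delimited inside `inputTerms`.

```python
from urllib.parse import urlencode

CTD_BATCH_URL = "http://ctdbase.org/tools/batchQuery.go"


def build_ctd_batch_url(input_type, input_ids, report):
    """Hypothetical helper: build a CTD batch-query GET url.

    All input IDs go into one pipe-delimited inputTerms parameter.
    """
    params = {
        "inputType": input_type,
        "inputTerms": "|".join(input_ids),
        "inputTermSearchType": "directAssociations",
        "report": report,
        "format": "json",
    }
    # safe="|" leaves the pipe delimiter unescaped, matching the url in this issue
    return f"{CTD_BATCH_URL}?{urlencode(params, safe='|')}"


url = build_ctd_batch_url("chem", ["C006303", "D015250"], "genes_curated")
print(url)
```

The printed url should match the two-ID query linked above.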
BTE's current flawed response

BTE links every output gene with only the first ID C006303 / acivicin / `PUBCHEM.COMPOUND:294641`. It's easier to see through the console log:

```
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:836 has 4 +1ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:1080 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:10800 has 1 +1ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:2678 has 3 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:834 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:841 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:1676 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:2623 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:2950 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:3145 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:4778 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:2908 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:142 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:6582 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:6607 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:6647 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:331 has 1 +0ms
```
Desired format for BTE's response

Instead, BTE should correctly link each input ID / entity with its associations. The console log should look like this:

* some results have the first input ID C006303 / acivicin / `PUBCHEM.COMPOUND:294641`
* other results have the second input ID D015250 / Aclarubicin / `PUBCHEM.COMPOUND:451415`
* PARP1 (NCBIGene:142) is only linked to the second ID: `PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:142`. Most genes are linked to only one of the input IDs.

```
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:836 has 1 +1ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:1080 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:10800 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:294641_&_n1-NCBIGene:2678 has 3 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:834 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:836 has 3 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:841 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:1676 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:2623 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:2950 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:3145 has 1 +1ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:4778 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:2908 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:142 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:6582 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:6607 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:6647 has 1 +0ms
bte:biothings-explorer-trapi:QueryResult result ID: n0-PUBCHEM.COMPOUND:451415_&_n1-NCBIGene:331 has 1 +0ms
```
rjawesome commented 1 year ago

This should be solvable with a custom pairCurieWithAPIResponse function. I can work on this in the JQ and/or JavaScript transformer for CTD.

rjawesome commented 1 year ago

Here is the pairCurieWithAPIResponse JQ that solves this problem:

`reduce (.response | .[]) as $item ({}; .[generateCurie($edge.association.input_id; $item.Input | ascii_upcase)] = [] + .[generateCurie($edge.association.input_id; $item.Input | ascii_upcase)] + [$item]) | map_values([.])`

I will push this shortly to the JQ branch, but I need to double-check that the "Input" field is present in all queries to CTD. (This pair function could also be set in the yaml for an operation via `transformer.pair_jq`.)
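For readers less familiar with JQ, the grouping idea can be sketched in Python as below. This is only an illustration of the pairing logic, not BTE code: `make_curie` is a hypothetical stand-in for `generateCurie` (its real behavior is assumed, not confirmed here), the final `map_values([.])` wrapping step is left out, and the second record in the sample data is a shortened placeholder.

```python
from collections import defaultdict


def make_curie(prefix, raw_id):
    """Hypothetical stand-in for generateCurie: add the prefix only if missing."""
    return raw_id if raw_id.startswith(prefix + ":") else f"{prefix}:{raw_id}"


def pair_inputs_with_hits(response, input_prefix):
    """Group CTD associations by the input ID that each one was matched to."""
    paired = defaultdict(list)
    for item in response:
        # CTD lowercases the "Input" value, so upper-case it before building the curie
        curie = make_curie(input_prefix, item["Input"].upper())
        paired[curie].append(item)
    return dict(paired)


response = [
    # abridged PARP1 record from the raw CTD response quoted in this issue
    {"ChemicalId": "D015250", "GeneId": "142", "GeneSymbol": "PARP1", "Input": "d015250"},
    # shortened placeholder record for the other input ID
    {"ChemicalId": "C006303", "GeneId": "2678", "Input": "c006303"},
]
paired = pair_inputs_with_hits(response, "MESH")
print(sorted(paired))
```

Each association now sits under the input ID it actually matched, rather than everything being attached to the first input ID.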

colleenXu commented 1 year ago

@tokebe

It's not clear to me how BTE will construct large batch-queries to CTD, and whether we'll need to make adjustments to BTE. I'm specifically thinking about:

Notes:

tokebe commented 1 year ago
colleenXu commented 1 year ago

Replying to @tokebe (thanks for the quick reply!) with my thoughts:

colleenXu commented 11 months ago

I think a safe batch-size is 80 IDs, assuming a 2048 character-max for the GET url.

Rough calculations

`2048 = a*x + (x-1) + b = (a+1)*x + (b-1)`

Where:

- `x` is the max number of IDs (round down to nearest integer)
- `a` is the number of characters in each ID (in API's required format)
- `b` is the number of characters in the rest of the url, which depends on the dataset/relationship and input ID namespaces
- `a*x` is for all the ID characters, `(x-1)` is for all the pipe-delimiters

The most crucial number is `a`. **The max number of characters for 1 input ID is 21 for REACT (Pathway) IDs.**

Character counts for all input IDs:

- 10
  - **MESH IDs without prefix**: 1 (C or D) plus 9 characters max according to [bioregistry](https://bioregistry.io/registry/mesh)
  - **NCBIGene IDs without prefix, estimated**: the longest ID I found in my browser history is 9 characters, [106099062](https://www.ncbi.nlm.nih.gov/gene/106099062). I'm estimating because [bioregistry](https://bioregistry.io/registry/ncbigene) doesn't give a character limit
- 11
  - **OMIM IDs with prefix, estimated**: 5 (`OMIM:`) + 6 characters, based on looking at the [new entries like 620637](https://omim.org/statistics/updates/2023/11). I'm estimating because [bioregistry](https://bioregistry.io/registry/omim) doesn't give a character limit
- 14
  - **KEGG.PATHWAY IDs with custom prefix**: 5 (`KEGG:`) + 9 characters max, based on [bioregistry](https://bioregistry.io/registry/kegg.pathway)
- 15
  - **MESH IDs with prefix**: 5 (`MESH:`) + 10 (explained above)
- 21
  - **REACT IDs with prefix, estimated**: 6 (`REACT:`) + 15 characters, based on looking at the v86 (latest) new/updated topics and pathways like [REACT:R-HSA-9836573.1](https://reactome.org/content/detail/R-HSA-9836573) (Mitochondrial RNA degradation)

`b = 140` for the 1 x-bte operation that uses REACT IDs as input. (An example GET url with 2 input IDs is: `http://ctdbase.org/tools/batchQuery.go?inputType=pathway&inputTermSearchType=directAssociations&report=genes_curated&format=json&inputTerms=REACT:R-HSA-5669034|REACT:R-HSA-5668541`)

So the equation for this situation is:

`2048 = (a+1)*x + (b-1) = (21+1)*x + (140-1) = 22*x + 139`, x ~ 86

Rounding down to the nearest ten gets 80.
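The same arithmetic as a small Python check, in case the numbers need to be redone for another operation or url limit. The helper name `max_ids_per_query` is made up for this sketch, and it assumes the pipe delimiter is sent unencoded (1 character each), as in the rough calculation above.

```python
MAX_URL_LENGTH = 2048  # assumed character max for the GET url


def max_ids_per_query(id_length, fixed_url_length, max_url_length=MAX_URL_LENGTH):
    """Largest integer x satisfying a*x + (x-1) + b <= max_url_length."""
    return (max_url_length - (fixed_url_length - 1)) // (id_length + 1)


# b: everything in the example GET url except the IDs and their pipe delimiters
fixed_part = (
    "http://ctdbase.org/tools/batchQuery.go?inputType=pathway"
    "&inputTermSearchType=directAssociations&report=genes_curated"
    "&format=json&inputTerms="
)
b = len(fixed_part)
a = 21  # longest input ID: REACT IDs with prefix

x = max_ids_per_query(a, b)
safe_batch_size = (x // 10) * 10  # round down to the nearest ten
print(b, x, safe_batch_size)  # 140 86 80
```

With 86 IDs the worst-case url is `21*86 + 85 + 140 = 2031` characters, and a 87th ID would push it past 2048.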

colleenXu commented 11 months ago

@tokebe

I'm getting JQ-related errors when I try to test the batch-size limit, using the process in the next section.

  1. If I start with the main branches, things seem to work okay. 1 of the 4 sub-queries fails, but that kind of error seems to be happening on dev/ci when I'm not testing the batch-size limit too.
Recreating the error with a simpler example, not testing the batch-size-limit

Noticed on ci/dev instances, but not test/prod. No overrides, no batch-size-limit-testing adjustments done.

TRAPI query:

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["MESH:D020138"],
                    "categories": ["biolink:Disease"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                }
            }
        }
    }
}
```

2/3 subqueries fail with `Error: jq: error (at :0): Cannot iterate over null (null)`: see full console logs [ctd-error-1.txt](https://github.com/biothings/biothings_explorer/files/13510911/ctd-error-1.txt)

Interestingly, I think those two sub-queries are returning 0 hits: [this](http://ctdbase.org/tools/batchQuery.go?inputType=disease&inputTermSearchType=directAssociations&report=genes_curated&format=json&inputTerms=MESH:C566403) and [this](http://ctdbase.org/tools/batchQuery.go?inputType=disease&inputTermSearchType=directAssociations&report=genes_curated&format=json&inputTerms=OMIM:603174), vs [the 3rd sub-query that has hits](http://ctdbase.org/tools/batchQuery.go?inputType=disease&inputTermSearchType=directAssociations&report=genes_curated&format=json&inputTerms=MESH:D020138)

  2. If I start with the dev branches, I encounter errors after doing the SmartAPI override (see step 6 in the next section). However, I also encounter this kind of error when I don't set the batch-size-limit (step 2) and when I use a simpler 2-ID query that normally works in dev (w/o the override).
recreating the problem with a simple query

Follow the steps in the next section, but don't set the batch-size-limit (step 2 in the next section). Then do the simple query that works in dev without the override:

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["REACT:R-HSA-5669034", "REACT:R-HSA-5668541"],
                    "categories": ["biolink:Pathway"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                }
            }
        }
    }
}
```

I'd normally get 134 results, but instead I get 0 results. In the console logs, the sub-query fails with `Error: jq: error (at :0): explode input must be a string`. The full console logs are: [simple-ctd-error-dev.txt](https://github.com/biothings/biothings_explorer/files/13511334/simple-ctd-error-dev.txt)


My full process to test the batch-size-limit

1. Setup: Check out the right branches (either main or dev), `pnpm i`.

2. Adding the batch-size limit to the query-handler's config

To [API_BATCH_SIZE](https://github.com/biothings/bte_trapi_query_graph_handler/blob/c4eb2bb1e2bcc54f60858584dc0dcf71692b78f0/src/config.ts#L1), add:

```
{
    id: '0212611d1c670f9107baf00b77f0889a',
    name: 'CTD API',
    max: 80,
},
```

3. Setting an override to use CTD x-bte annotation for batch-querying

I actually override to my local file with the branch checked out, but this should do the same thing. Paste into [BTE's smartapi_overrides file](https://github.com/biothings/bte-server/blob/main/src/config/smartapi_overrides.json), so [it'll use this x-bte annotation](https://github.com/NCATS-Tangerine/translator-api-registry/blob/ctd-batch-query/CTD/smartapi.yaml):

```
{
    "conf": {
        "only_overrides": true
    },
    "apis": {
        "0212611d1c670f9107baf00b77f0889a": "https://raw.githubusercontent.com/NCATS-Tangerine/translator-api-registry/ctd-batch-query/CTD/smartapi.yaml"
    }
}
```

4. `pnpm build`, then `API_OVERRIDE=true pnpm run smartapi_sync` to set up BTE with the changes and get the x-bte info

5. Run BTE, then query CTD thru BTE (`http://localhost:3000/v1/smartapi/0212611d1c670f9107baf00b77f0889a/query`) with this request body [trapi_300react.txt](https://github.com/biothings/biothings_explorer/files/13510446/trapi_300react.txt). It's a TRAPI query for 300 REACT IDs (Pathway) -> Gene. BTE then runs 4 sub-queries, which is correct (3*80 + 60).

   Note: All the IDs are real IDs for human pathways (from [Reactome](https://reactome.org/download-data)'s Complete List of Pathways), but CTD may not have data for them.

6. If I started with dev instances and run that query, all the sub-queries fail with the message `The error is Error: jq: error (at :0): explode input must be a string`. Full console logs: [console-300react.txt](https://github.com/biothings/biothings_explorer/files/13510534/console-300react.txt)
Console log of a sub-query

```
bte:call-apis:query using template builder +0ms
bte:call-apis:query query success, transforming hits->records... +0ms
bte:api-response-transform:index api name CTD API +0ms
bte:api-response-transform:index api tags: translator,ctd +0ms
bte:call-apis:query Failed to make to following query: {"url":"http://ctdbase.org/tools/batchQuery.go","params":{"inputType":"pathway","inputTerms":"REACT:R-HSA-446193|REACT:R-HSA-196780|REACT:R-HSA-9636467|REACT:R-HSA-9033658|REACT:R-HSA-70895|REACT:R-HSA-352238|REACT:R-HSA-168302|REACT:R-HSA-162588|REACT:R-HSA-450385|REACT:R-HSA-8851680|REACT:R-HSA-5621481|REACT:R-HSA-75102|REACT:R-HSA-5218900|REACT:R-HSA-9662834|REACT:R-HSA-5621575|REACT:R-HSA-5690714|REACT:R-HSA-389356|REACT:R-HSA-389357|REACT:R-HSA-389359|REACT:R-HSA-9013148|REACT:R-HSA-68689|REACT:R-HSA-9833576|REACT:R-HSA-69017|REACT:R-HSA-447041|REACT:R-HSA-5607763|REACT:R-HSA-5607764|REACT:R-HSA-5660668|REACT:R-HSA-6811434|REACT:R-HSA-6811436|REACT:R-HSA-6807878|REACT:R-HSA-204005|REACT:R-HSA-140180|REACT:R-HSA-199920|REACT:R-HSA-442742|REACT:R-HSA-442720|REACT:R-HSA-442729|REACT:R-HSA-8874211|REACT:R-HSA-399956|REACT:R-HSA-2024101|REACT:R-HSA-389513|REACT:R-HSA-5358747|REACT:R-HSA-5358749|REACT:R-HSA-5358751|REACT:R-HSA-5358752|REACT:R-HSA-211999|REACT:R-HSA-111996|REACT:R-HSA-1296052|REACT:R-HSA-4086398|REACT:R-HSA-111997|REACT:R-HSA-111932|REACT:R-HSA-2025928|REACT:R-HSA-419812|REACT:R-HSA-111933|REACT:R-HSA-901042|REACT:R-HSA-111957|REACT:R-HSA-72737|REACT:R-HSA-8955332|REACT:R-HSA-5576891|REACT:R-HSA-9733709|REACT:R-HSA-5694530","inputTermSearchType":"directAssociations","report":"genes_curated","format":"json"},"method":"get","timeout":50000,"headers":{"User-Agent":"BTE/dev Node/v18.16.1 darwin"}}. The error is Error: jq: error (at :0): explode input must be a string
bte:call-apis:query with Error: jq: error (at :0): explode input must be a string
bte:call-apis:query
bte:call-apis:query     at ChildProcess. (/Users/colleenxu/Desktop/BTE_typescript_pnpm/biothings_explorer/node_modules/.pnpm/node-jq@4.2.2/node_modules/node-jq/lib/exec.js:31:35)
bte:call-apis:query     at ChildProcess.emit (node:events:513:28)
bte:call-apis:query     at ChildProcess.emit (node:domain:489:12)
bte:call-apis:query     at maybeClose (node:internal/child_process:1091:16)
bte:call-apis:query     at ChildProcess._handle.onexit (node:internal/child_process:302:5)
bte:call-apis:query     at Process.callbackTrampoline (node:internal/async_hooks:130:17) +24ms
```

tokebe commented 11 months ago

Looks like this is a problem in the JQ string, largely due to CTD's inconsistent response structure depending on if anything was found or not. Working on a fix...

tokebe commented 11 months ago

Ok, turns out this was less CTD's inconsistencies and more JQ's inconsistencies (and my lack of familiarity...). I've pushed a fix to dev which should address this.

colleenXu commented 11 months ago

The fix worked!

I tested all 3 example queries in my previous post in both dev and main (CI) branches. Everything worked as-intended without any errors.

The PRs to deploy are:

colleenXu commented 11 months ago

Update!

I've included the CTD x-bte changes in the overrides https://github.com/biothings/bte-server/pull/4 - so it'll deploy alongside the orphanet changes. I think the override will end up deploying with or after the code changes (JQ / batch-size-limit), so I don't anticipate any issues. (aka I think NodeNorm will deploy the orphanet changes at the same pace or slower than our deployments to instances).

colleenXu commented 11 months ago

I think we can close this issue once:

We'll then have a separate process to remove the overrides (not needed once the yaml PRs are all merged / registrations refreshed).

colleenXu commented 10 months ago

@tokebe

I double-checked and it's not working on CI, probably because of the larger cache-update issues (recent lab Slack convo)

My test

POST to CTD through BTE CI `https://bte.ci.transltr.io/v1/smartapi/0212611d1c670f9107baf00b77f0889a/query`

```
{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["KEGG.PATHWAY:hsa05323", "KEGG.PATHWAY:hsa04917"],
                    "categories": ["biolink:Pathway"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                }
            }
        }
    }
}
```

Based on the logs in the TRAPI response, I can tell that 2 sub-queries were sent (1 ID each). But if batch-querying was working, only 1 sub-query should have been sent. This may mean BTE CI didn't successfully use the override.

```
{
    "timestamp": "2023-12-16T06:08:27.395Z",
    "level": "DEBUG",
    "message": "call-apis: 2 planned queries for edge e01",
    "code": null
},
{
    "timestamp": "2023-12-16T06:08:27.792Z",
    "level": "DEBUG",
    "message": "Successful GET http://ctdbase.org (1 ID): Pathway > has_participant > Gene (obtained 70 records, took 121ms)",
    "code": null
},
{
    "timestamp": "2023-12-16T06:08:27.808Z",
    "level": "DEBUG",
    "message": "Successful GET http://ctdbase.org (1 ID): Pathway > has_participant > Gene (obtained 89 records, took 178ms)",
    "code": null
},
```
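This check can also be scripted instead of read by eye. The sketch below is a hypothetical helper (not part of BTE) that scans log entries for the `call-apis: N planned queries for edge ...` message seen in this test; it assumes the entries have already been pulled out of the TRAPI response's `logs` list and that the message wording stays the same.

```python
import re


def planned_query_count(logs, edge_id):
    """Return the number of planned sub-queries logged for an edge, or None if not found."""
    pattern = re.compile(
        rf"call-apis: (\d+) planned queries for edge {re.escape(edge_id)}$"
    )
    for entry in logs:
        match = pattern.search(entry.get("message", ""))
        if match:
            return int(match.group(1))
    return None


# Log entries copied from this test (JSON null becomes None in Python)
logs = [
    {
        "timestamp": "2023-12-16T06:08:27.395Z",
        "level": "DEBUG",
        "message": "call-apis: 2 planned queries for edge e01",
        "code": None,
    },
    {
        "timestamp": "2023-12-16T06:08:27.792Z",
        "level": "DEBUG",
        "message": "Successful GET http://ctdbase.org (1 ID): Pathway > has_participant > Gene (obtained 70 records, took 121ms)",
        "code": None,
    },
]

count = planned_query_count(logs, "e01")
print(count)  # 2, so the two IDs were not batched into one sub-query
```

For a 2-ID query with batching working, the expected count is 1.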

tokebe commented 10 months ago

Issue should now be addressed by https://github.com/biothings/biothings_explorer/commit/3019cecf670e5b0fc04877c31956b2bbbc3d7e4e, please test again

colleenXu commented 10 months ago

Now it's working on BTE CI! Yay!

The previous test now works as-intended - with 1 planned batch-query. Logs:

        {
            "timestamp": "2023-12-18T21:40:08.965Z",
            "level": "DEBUG",
            "message": "call-apis: 1 planned queries for edge e01",
            "code": null
        },
        {
            "timestamp": "2023-12-18T21:40:09.492Z",
            "level": "DEBUG",
            "message": "Successful GET http://ctdbase.org (2 IDs): Pathway > has_participant > Gene (obtained 159 records, took 181ms)",
            "code": null
        },

I also tested the batch-size-limit=80 with a 150-QNode-IDs query (current max, see #762), and it worked too. Two sub-queries were sent (80 + 70)
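As a sanity check on the expected sub-query counts, here is a minimal Python sketch of splitting input IDs by a batch-size limit. It is not BTE's actual implementation, just the arithmetic; `split_into_batches` is a made-up name and the IDs are placeholders, not real pathways.

```python
def split_into_batches(ids, batch_size):
    """Split a list of input IDs into consecutive chunks of at most batch_size."""
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]


# Placeholder IDs, only used to count batch sizes
ids_150 = [f"REACT:R-HSA-{n}" for n in range(150)]
ids_300 = [f"REACT:R-HSA-{n}" for n in range(300)]

sizes_150 = [len(batch) for batch in split_into_batches(ids_150, 80)]
sizes_300 = [len(batch) for batch in split_into_batches(ids_300, 80)]
print(sizes_150)  # [80, 70]
print(sizes_300)  # [80, 80, 80, 60]
```

These match the 2 sub-queries (80 + 70) seen here and the 4 sub-queries (3*80 + 60) from the earlier 300-ID test.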

Batch-size-limit test

POST to CTD through BTE CI `https://bte.ci.transltr.io/v1/smartapi/0212611d1c670f9107baf00b77f0889a/query` using the attached JSON as the request body: [CTD-150ReactIDs.txt](https://github.com/biothings/biothings_explorer/files/13708873/CTD-150ReactIDs.txt)

Logs show that two sub-queries were sent (80 + 70), so the batch-size-limit of 80 was respected.

```
{
    "timestamp": "2023-12-18T21:41:52.878Z",
    "level": "DEBUG",
    "message": "call-apis: 2 planned queries for edge e01",
    "code": null
},
{
    "timestamp": "2023-12-18T21:42:02.309Z",
    "level": "DEBUG",
    "message": "Successful GET http://ctdbase.org (80 IDs): Pathway > has_participant > Gene (obtained 1703 records, took 195ms)",
    "code": null
},
{
    "timestamp": "2023-12-18T21:42:02.344Z",
    "level": "DEBUG",
    "message": "Successful GET http://ctdbase.org (70 IDs): Pathway > has_participant > Gene (obtained 2603 records, took 290ms)",
    "code": null
},
```

colleenXu commented 8 months ago

I've confirmed that things work as-expected after the Prod deployment. Closing issue, updating the registered yamls and registrations, and opening another issue for removing the overrides.

Example: POST the request body below to https://bte.transltr.io/v1/smartapi/0212611d1c670f9107baf00b77f0889a/query. The response will have results and a log saying `Successful GET http://ctdbase.org (2 IDs): Pathway > has_participant > Gene (obtained 159 records, took 215ms)`. This shows that the batch-query occurred.

{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:related_to"]
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["KEGG.PATHWAY:hsa05323", "KEGG.PATHWAY:hsa04917"],
                    "categories": ["biolink:Pathway"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                }
            }
        }
    }
}