Closed DSuveges closed 1 year ago
As reported by a user, another query shows a similar behaviour:
query study {
  studiesForGene(geneId: "ENSG00000169174") {
    study {
      source
      pmid
      pubDate
      pubJournal
      pubTitle
      pubAuthor
      hasSumstats
      nInitial
      nReplication
      nCases
      traitCategory
      numAssocLoci
    }
  }
}
Most likely the two problems share a common underlying cause. Once this issue is fixed, both endpoints should work just fine.
It isn't really clear what is going on. Each CH instance is given its own complete copy of the database on start-up. The instances running in the US and in our development environments return the expected results. The instance in the EU contains errors.
They are all using the same disk image, ch-disk-jldgdj65-image-v2, to start the database. When I ssh into the EU node and go into the database I can see:
SELECT
name,
code,
value,
last_error_message
FROM system.errors
WHERE value > 0
ORDER BY code ASC
Query id: b15dc9a7-c090-4cff-9a49-5a6aa19c45cc
┌─name──────────────┬─code─┬─value─┬─last_error_message──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CANNOT_OPEN_FILE │ 76 │ 1 │ Cannot open certificate file: /etc/clickhouse-server/server.crt. │
│ FILE_DOESNT_EXIST │ 107 │ 1 │ Cannot open file /var/lib/clickhouse/store/e9d/e9d00a82-974a-4cbb-b378-ba608f972522/all_4951_5496_4/lead_pos.bin, errno: 2, strerror: No such file or directory │
└───────────────────┴──────┴───────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
The FILE_DOESNT_EXIST error is what is causing the problem in the API.
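To follow up on that error, it can help to pull the table UUID and part name out of the reported path and then ask ClickHouse which table the part belongs to. The following is only a minimal sketch: it assumes the path follows the `store/<prefix>/<table uuid>/<part>/<file>` layout seen in the message above, and the helper names `parse_missing_file` and `table_lookup_sql` are hypothetical. The generated SQL relies on the `uuid` column of `system.tables`.

```python
import re

# Assumed path layout: .../store/<3-char prefix>/<table uuid>/<part name>/<file>
STORE_PATH = re.compile(
    r"/store/[0-9a-f]{3}/(?P<uuid>[0-9a-f-]{36})/(?P<part>[^/]+)/(?P<file>[^/,\s]+)"
)


def parse_missing_file(message):
    """Return the uuid, part and file named in an error message, or None."""
    match = STORE_PATH.search(message)
    return match.groupdict() if match else None


def table_lookup_sql(table_uuid):
    """Build a query that maps a table UUID back to its database and name."""
    return (
        "SELECT database, name FROM system.tables "
        f"WHERE uuid = toUUID('{table_uuid}')"
    )


# Error text copied from the system.errors output on the EU node.
message = (
    "Cannot open file /var/lib/clickhouse/store/e9d/"
    "e9d00a82-974a-4cbb-b378-ba608f972522/all_4951_5496_4/lead_pos.bin, "
    "errno: 2, strerror: No such file or directory"
)

info = parse_missing_file(message)
print(info)
print(table_lookup_sql(info["uuid"]))
```

Running the printed query on the EU node and on a healthy node would show whether the affected table is the one backing the failing endpoints.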
Compared with the US:
SELECT
name,
code,
value,
last_error_message
FROM system.errors
WHERE value > 0
ORDER BY code ASC
Query id: ec88c3c2-f56b-426b-a389-70868ac584b4
┌─name─────────────┬─code─┬─value─┬─last_error_message───────────────────────────────────────────────┐
│ CANNOT_OPEN_FILE │ 76 │ 1 │ Cannot open certificate file: /etc/clickhouse-server/server.crt. │
└──────────────────┴──────┴───────┴──────────────────────────────────────────────────────────────────┘
When the instance starts up it mounts the disk and then starts ClickHouse in a Docker container. My best guess is that something goes wrong while it is mounting the data. I just can't see why it would affect some instances and not others.
If the data somehow got corrupted, wouldn't we expect missing data/tables in the UI? However, https://genetics.opentargets.org/gene/ENSG00000169174 and https://genetics.dev.opentargets.xyz/gene/ENSG00000169174 show the same table with the same values (assuming studiesForGene and studiesAndLeadVariantsForGene fill the tables on the gene page).
The assumption is wrong (as far as I can tell) :smile: .
That page appears to be using the geneInfo and colocalisationsForGene endpoints. I'm not actually sure where the endpoint is used on the FE (if at all). There are enough 'artifacts' in the genetics portal that I'm never surprised to find dead-ends.
As @d0choa pointed out, these endpoints are not used anywhere in the UI; they are only listed in the API documentation as examples.
I'm glad we spent the day fixing that then...somewhat deflating.
Yesterday, another user reported a problem on Community with the studiesAndLeadVariantsForGene API example query.
As mentioned in one of the comments above, this example query works on the dev instance but not in production.
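The dev-versus-production difference can be checked with a small script that sends the same query to each instance and reports which ones come back empty. This is a rough sketch under several assumptions: the API URLs are not given in this thread, so they are passed in as parameters; the function names are hypothetical; and it uses a shortened form of the studiesForGene query quoted earlier, which can be swapped for any other example query.

```python
import json
import urllib.request


def build_payload(gene_id):
    """Encode a shortened studiesForGene query as a GraphQL POST body."""
    query = (
        "query study { studiesForGene(geneId: %s) "
        "{ study { pmid pubTitle } } }" % json.dumps(gene_id)
    )
    return json.dumps({"query": query}).encode("utf-8")


def run_query(api_url, gene_id):
    """POST the query to one GraphQL endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        api_url,
        data=build_payload(gene_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def is_empty(result, field="studiesForGene"):
    """True when the response has no data for the requested field."""
    data = result.get("data") or {}
    return not data.get(field)


def compare(urls, gene_id):
    """Map each instance name to whether its response was empty."""
    return {name: is_empty(run_query(url, gene_id)) for name, url in urls.items()}
```

Calling `compare({"prod": prod_url, "dev": dev_url}, "ENSG00000169174")` with the real endpoint URLs filled in would give a quick per-instance view of the discrepancy.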
Discussed with Daniel S. As an easy fix, we suggest removing this example query from the Genetics API doc page. The other example queries work as expected. @chinmehta, can you help remove the example query please, as you implemented the page originally? Please do let me know if you have any questions on this.
We still have an issue in the API, but we will ignore it for now as it's not used in production.
Description of the bug
Our users reported on the community portal that the studiesAndLeadVariantsForGene GraphQL endpoint in the genetics portal returns no data. Whatever gene IDs we request in the following GraphQL request, the response is always empty.
Returned data:
Interestingly the same request returns correct dataset in the dev instance.
Expected behaviour
The query should return something like this: