Open alisman opened 4 days ago
This curl doesn't seem to produce any discrepancy. I think it is missing the UNKNOWN
filter.
Updating the filters to something like this reproduces the issue:
{"clinicalDataFilters":[{"attributeId":"CANCER_TYPE_DETAILED","values":[{"value":"BREAST INVASIVE DUCTAL CARCINOMA"}]},{"attributeId":"AGE_AT_SEQ_REPORT","values":[{"start":80,"end":85},{"start":85},{"value":"UNKNOWN"}]}],"studyIds":["genie_public"],"alterationFilter":{"copyNumberAlterationEventTypes":{"AMP":true,"HOMDEL":true},"mutationEventTypes":{"any":true},"structuralVariants":null,"includeDriver":true,"includeVUS":true,"includeUnknownOncogenicity":true,"includeUnknownTier":true,"includeGermline":true,"includeSomatic":true,"includeUnknownStatus":true,"tiersBooleanMap":{}}}
@onursumer any idea why it happens?
Still looking into it. It happens when UNKNOWN
is selected together with numerical values. It doesn't happen only UNKNOWN
is selected.
@alisman Looks like we are not properly handling the case where numerical and categorical filters are together in the same ClinicalDataFilter.
We have a method to check whether a filter is catergorical or not, but it only checks the first filter in the list. https://github.com/cBioPortal/cbioportal/blob/79dc173f271037a9c617ea78ef660fa82f2767df/src/main/resources/org/cbioportal/persistence/mybatisclickhouse/StudyViewFilterMapper.xml#L280 https://github.com/cBioPortal/cbioportal/blob/ceed01f3b907158109c09b7b78b7a1600df3e24b/src/main/java/org/cbioportal/persistence/helper/StudyViewFilterHelper.java#L84-L87
So, for example, we treat the clinical data filter below as a numerical filter because the first filter is numerical.
The SQL dealing with numerical filters only checks for NA
and ignores any other value
.
So, for this specific example we always ignore UNKNOWN
. I guess we need to somehow improve the SQL to handle both numerical and categorical filters at the same time.
The filtering result is identical this one (because we ignore UNKNOWN
): http://localhost:8082/study/summary?id=genie_public#filterJson={"clinicalDataFilters":[{"attributeId":"AGE_AT_SEQ_REPORT","values":[{"start":80,"end":85},{"start":85}]}],"studyIds":["genie_public"],"alterationFilter":{"copyNumberAlterationEventTypes":{"AMP":true,"HOMDEL":true},"mutationEventTypes":{"any":true},"structuralVariants":null,"includeDriver":true,"includeVUS":true,"includeUnknownOncogenicity":true,"includeUnknownTier":true,"includeGermline":true,"includeSomatic":true,"includeUnknownStatus":true,"tiersBooleanMap":{}}}
curl 'http://localhost:8082/api/column-store/filtered-samples/fetch' \ -H 'Accept: application/json' \ -H 'Accept-Language: en-US,en;q=0.9' \ -H 'Cache-Control: no-cache' \ -H 'Connection: keep-alive' \ -H 'Content-Type: application/json' \ -H 'Cookie: _ga=GA1.1.1887007066.1710956751; _ga_5260NDGD6Z=GS1.1.1721930557.11.1.1721930574.0.0.0' \ -H 'Origin: http://localhost:8082' \ -H 'Pragma: no-cache' \ -H 'Referer: http://localhost:8082/study/summary?id=genie_public' \ -H 'Sec-Fetch-Dest: empty' \ -H 'Sec-Fetch-Mode: cors' \ -H 'Sec-Fetch-Site: same-origin' \ -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36' \ -H 'sec-ch-ua: "Chromium";v="130", "Google Chrome";v="130", "Not?A_Brand";v="99"' \ -H 'sec-ch-ua-mobile: ?0' \ -H 'sec-ch-ua-platform: "macOS"' \ --data-raw '{"clinicalDataFilters":[{"attributeId":"CANCER_TYPE_DETAILED","values":[{"value":"BREAST INVASIVE DUCTAL CARCINOMA"}]},{"attributeId":"AGE_AT_SEQ_REPORT","values":[{"start":65,"end":70},{"start":70,"end":75},{"start":75,"end":80}]}],"studyIds":["genie_public"],"alterationFilter":{"copyNumberAlterationEventTypes":{"AMP":true,"HOMDEL":true},"mutationEventTypes":{"any":true},"structuralVariants":null,"includeDriver":true,"includeVUS":true,"includeUnknownOncogenicity":true,"includeUnknownTier":true,"includeGermline":true,"includeSomatic":true,"includeUnknownStatus":true,"tiersBooleanMap":{}}}'