fhirbase / fhirbase-plv8

[DEPRECATED] Fhirbase 2.0 is a FHIR relational storage
https://health-samurai.io/fhirbase

Indexing/pagination total count performance issues #116

Closed Andrzej-Pietrzak closed 7 years ago

Andrzej-Pietrzak commented 8 years ago
  1. The total count and paging cause queries with large result sets to be slow. Could we discuss adding a parameter, or a second search method, that would omit the links and total-count sections entirely?
  2. Some of the searches seem slow: Boolean, Quantity, Code/CodeableConcept/Coding.
  3. DateTime and Period searches are slow (e.g. Observation.date is a dateTime or Period).
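A sketch of a count-free call, assuming the _totalMethod=no flag that appears in later comments on this thread does what its name suggests and skips the total-count computation:

```sql
-- Sketch only: _totalMethod=no (seen in later comments on this thread)
-- is assumed to skip the total-count computation for a large result set.
SET plv8.start_proc = 'plv8_init';
SELECT fhir_search(
  '{"resourceType": "Patient", "queryString": "active=true&_count=10&_totalMethod=no"}'
);
```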

Tested with dataset:

Search times:

boolean:
Patient?active=false -> 0 res, Time: 81654.428 ms (empty set, so count is not an issue)
Patient?active=true -> 3M res, Time: 77672.724 ms

code:

string:
Patient?given=Tim -> 2977 res, Time: 116.572 ms
Patient?family=Smith -> 12685 res, Time: 614.471 ms

Identifier:
Patient?identifier=155996 -> 3 res, Time: 7.730 ms
Patient?identifier=155997 -> 5 res, Time: 7.004 ms
Patient?identifier=155998 -> 1 res, Time: 8.804 ms
Patient?identifier=155999 -> 2 res, Time: 6.755 ms
Patient?identifier=Hospital Number|155999 -> 2 res, Time: 8.103 ms
Patient?identifier=nonexisting|155999 -> 0 res, Time: 13.298 ms
Patient?identifier=NHS Number|155999 -> 0 res, Time: 5.728 ms
Patient?identifier=NHS Number|4560123456 -> 0 res, Time: 5.868 ms

ContactPoint:
Patient?telecom=1-118-942-6952 -> 1 res, Time: 28.317 ms
Patient?phone=1-118-942-6952 -> 1 res, Time: 9.165 ms

CodeableConcept:
Observation?code=kgcb19994-3 -> 46502 res, Time: 23117.381 ms (maybe count is an issue?)
Observation?code=55284-4 -> 3000000 res, Time: 631105.296 ms
Observation?code=0asdter -> 0 res, Time: 8.657 ms

Coding (as a single coding):
DiagnosticReport?code=10671-6 -> 61086 res, Time: 24070.661 ms (maybe count is an issue?)
DiagnosticReport?code=10744-1 -> 60565 res, Time: 22976.348 ms
DiagnosticReport?code=10777-1 -> 60608 res, Time: 21693.530 ms
DiagnosticReport?code=nothing -> 0 res, Time: 8.648 ms

Reference:
Observation?patient=062fac5a-f185-4735-8b6a-08b8f5c7ed83 -> 5 res, Time: 26.207 ms
Observation?patient=something -> 0 res, Time: 6.421 ms
Patient?_id=062fac5a-f185-4735-8b6a-08b8f5c7ed83 -> 1 res, Time: 18.706 ms
Patient?_id=062fac5a-f185-4735-8b6a-08b8f5c7ed83&_include=Observation:patient -> 1 res (and 5 includes), Time: 10.482 ms
Observation?patient=062fac5a-f185-4735-8b6a-08b8f5c7ed83&_include=patient -> 5 res (and 1 include), Time: 7.778 ms
DiagnosticReport?_id=54d30aeb-0ebe-482c-aa6f-8b3c55d581dc&_include=result -> 1 res (and 2 observations), Time: 41.008 ms

Quantity (given 15M observations, of which 300k are weight measurements in kg in the range 20-150):
Observation?value-quantity=gt150 -> 2001804 res, Time: 455556.565 ms
Observation?value-quantity=gt198 -> 80168 res, Time: 30602.466 ms
Observation?value-quantity=eq100 -> 156648 res, Time: 61523.574 ms
Observation?value-quantity=gt150||kg -> 0 res, Time: 642673.473 ms
Observation?value-quantity=gt100||kg -> 509624 res, Time: 380629.512 ms
Observation?value-quantity=ge150||kg -> 10120 res, Time: 294714.285 ms
Observation?value-quantity=ge150|someNonexistingSys|kg -> 0 res, Time: 377733.673 ms

DateTime:
DiagnosticReport?issued=eq2007-09-18T10:59:36.352Z -> 1 res, Time: 80.448 ms
DiagnosticReport?issued=gt2017-09-18T10:59:36.352Z -> 0 res, Time: 12.172 ms
DiagnosticReport?issued=gt2016-04-01T00:00:00Z -> 46903 res, Time: 9175.543 ms

DateTime or Period:
Observation?date=gt2017-05-05T10:00:00 -> 0 res, Time: 615413.026 ms (empty set, so count is not an issue)
Observation?date=eq1993-06-16T20:29:19.188Z -> 1 res, Time: 790026.877 ms

Date:
birthdate=eq2010-01-01 -> 190 res, Time: 55.705 ms
birthdate=eq1924-01-01 -> 0 res, Time: 8.881 ms
birthdate=eq1990-01-01 -> 125 res, Time: 12.198 ms
birthdate=gt2012-01-01 -> 185549 res, Time: 21494.050 ms
birthdate=gt2016-01-01 -> 13938 res, Time: 68.909 ms
birthdate=ge2016-01-01 -> 14064 res, Time: 33.016 ms
birthdate=lt1930-01-01 -> 0 res, Time: 6.926 ms
birthdate=lt1947-01-01 -> 26812 res, Time: 109.724 ms
birthdate=le1947-01-01 -> 26930 res, Time: 113.753 ms
birthdate=lt1950-01-01 -> 155457 res, Time: 18612.779 ms
birthdate=lt1950-01-01&birthdate=gt1949-01-01 -> 42872 res, Time: 158.717 ms
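The pattern in these numbers (empty result sets come back in milliseconds, large ones in seconds to minutes) suggests the count dominates. A minimal SQLite sketch, purely illustrative (made-up table and index names, not fhirbase's schema), of why one page through an index stays cheap while the total count must visit every matching row:

```python
import sqlite3
import time

# Toy model: an indexed search that returns one page quickly,
# while computing the total count still visits every match.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation (id INTEGER PRIMARY KEY, code TEXT)")
conn.executemany(
    "INSERT INTO observation (code) VALUES (?)",
    (("55284-4",) for _ in range(200_000)),
)
conn.execute("CREATE INDEX idx_obs_code ON observation (code)")

t0 = time.perf_counter()
page = conn.execute(
    "SELECT id FROM observation WHERE code = ? LIMIT 10", ("55284-4",)
).fetchall()
page_ms = (time.perf_counter() - t0) * 1000

t0 = time.perf_counter()
(total,) = conn.execute(
    "SELECT count(*) FROM observation WHERE code = ?", ("55284-4",)
).fetchone()
count_ms = (time.perf_counter() - t0) * 1000

print(f"page of {len(page)} rows: {page_ms:.3f} ms")
print(f"total of {total} rows: {count_ms:.3f} ms")
```

The page query touches ~10 index entries regardless of result-set size; the count has to touch all of them, so it scales with the number of matches.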

szymonp-kainos commented 8 years ago

What's the status of this issue?

manuso commented 8 years ago

The date indexes are not working properly.

Queries like:

/Appointment?_type=Appointment,Encounter&_count=10&date=ge2016-10-09T22:00:00.000Z&date=le2016-10-10T21:59:59.999Z&_sort:asc=date&location=278751fd-0a35-4ddf-a64b-c68fbe1ac750

end up with very poor performance because the indexes are not being picked up.

Could the date issues be addressed first?

maksym commented 7 years ago

Date index testing shows that there are issues with some of the 'period' and 'dateTime' queries that will require more work. For 2 of the 4 tested parameters there were queries that should return 0 results and finish immediately, yet took a long time for an unknown reason.

For example, given Encounter resources dated between 1970 and 2016, searching with an open date range (only lt, le, gt, or ge) that is expected to return no results takes a few minutes to complete:

SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=lt1910-01-01&_totalMethod=no"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=le1911-03-21&_totalMethod=no"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=gt2020-01-01&_totalMethod=no"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=ge2021-04-05&_totalMethod=no"}');

Adding additional query params does not help; a similar long-running query:

SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=le1911-03-21&organization=a1bfe328-a775-4a82-8f88-b1608bc589f5&_totalMethod=no"}');

On the other hand, closing the range by adding the missing upper or lower bound partially helps:

SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=gt1910-12-01&date=lt1920-01-01"}'); -- Time: 321.489 ms, 0 res
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=gt2019-12-01&date=lt2020-01-01"}'); -- Time: 5.817 ms, 0 res

I've tested this on a date (Patient.birthdate), an instant (Appointment.date), a period (Encounter.date), and a dateTime (Observation.date). There were no problems with date or instant; the issue can be observed only with period and dateTime.

fhirbase-date-index-testing.txt

maksym commented 7 years ago

SET plv8.start_proc = 'plv8_init';
SELECT fhir_index_parameter('{"resourceType": "Encounter", "name": "date"}');
VACUUM ANALYZE encounter;
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=lt1910-01-01&_totalMethod=no"}');

SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=lt1910-01-01&_totalMethod=no"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=le1911-03-21&_totalMethod=no"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=gt2020-01-01&_totalMethod=no"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=ge2021-04-05&_totalMethod=no"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=gt1910-12-01&date=lt1920-01-01"}');
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=gt2019-12-01&date=lt2020-01-01"}');

EXPLAIN ANALYZE SELECT tbl1.*
FROM encounter tbl1
WHERE fhir_extract_as_epoch_lower(
        (resource)::json,
        ('[{"path":["Encounter","period"],"elementType":"Period"}]')::json
      ) < extract(epoch from ('1910-01-01')::timestamp with time zone)::double precision
LIMIT 10;

EXPLAIN ANALYZE SELECT tbl1.*
FROM encounter tbl1
WHERE fhir_extract_as_epoch_upper(
        (resource)::json,
        ('[{"path":["Encounter","period"],"elementType":"Period"}]')::json
      ) > extract(epoch from ('2020-01-01')::timestamp with time zone)::double precision
LIMIT 10;
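The two plans above probe the period's lower and upper bounds through separate extraction functions. Worth noting: a functional (expression) index is only eligible when the query predicate is written over exactly the indexed expression. A small SQLite sketch of that matching rule (illustrative only; table, path, and index names are made up, this is not fhirbase's machinery):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enc (resource TEXT)")
# JSON resources with a period.start spread over the years 1900-1999.
conn.executemany(
    "INSERT INTO enc VALUES (?)",
    ((f'{{"period": {{"start": "19{i % 100:02d}-01-01"}}}}',) for i in range(5000)),
)
# Expression index over the extracted lower bound.
conn.execute(
    "CREATE INDEX idx_start ON enc (json_extract(resource, '$.period.start'))"
)
conn.execute("ANALYZE")

def plan(sql):
    # Concatenate the detail column of EXPLAIN QUERY PLAN output.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Predicate matches the indexed expression -> index range scan.
matched = plan(
    "SELECT * FROM enc "
    "WHERE json_extract(resource, '$.period.start') < '1910-01-01' LIMIT 10"
)
# Predicate over a different expression -> full table scan.
unmatched = plan(
    "SELECT * FROM enc "
    "WHERE json_extract(resource, '$.period.end') < '1910-01-01' LIMIT 10"
)
print(matched)
print(unmatched)
```

The first plan mentions idx_start; the second falls back to a scan, which is the behavior to look for in the EXPLAIN ANALYZE output above.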

Andrzej-Pietrzak commented 7 years ago

I retested these date searches with the same data, and they seem to work fine this time.

I think there's a chance I forgot to force garbage collection after dropping one data set and ingesting a new one. The problem is reproducible when you terminate a storage, create a new one, ingest data, and then apply indexes again, but it disappears after calling VACUUM ANALYZE. So I'm guessing that's just user error on my side.
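For reference, the sequence that avoids the problem, restated from the statements earlier in this thread (the storage re-creation and ingest steps are elided):

```sql
SET plv8.start_proc = 'plv8_init';
-- (terminate the old storage, create a new one, re-ingest the data set)
SELECT fhir_index_parameter('{"resourceType": "Encounter", "name": "date"}');
-- Refresh planner statistics; skipping this reproduces the slow plans:
VACUUM ANALYZE encounter;
SELECT fhir_search('{"resourceType": "Encounter", "queryString": "date=lt1910-01-01&_totalMethod=no"}');
```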