Closed jodysalt closed 7 years ago
This is still an issue in production.
https://helpdesk.gridpp.rl.ac.uk:443/Ticket/Display.html?id=174864
I have reproduced the bug - now I need to fix it
In fact it is not a bug in ICAT. The problem is that the internal query (i.e. without the limit clause) does not provide a fully defined ordering so when used with different limit clauses does not work as expected. The solution I adopted was to change the order by from "ORDER BY df.datafileCreateTime desc" to "ORDER BY df.datafileCreateTime desc, id". This is still able to use the index on the datafileCreateTime. Note that this field is not generally indexed so consideration should be given to just using "order by id desc" which also works fine.
This is for icatapi-preprod.diamond.ac.uk (4.6.1)
When paginating the results I get the following:
(The only difference between these two queries is "ORDER BY datafile.datafileCreateTime desc")
query1: select distinct datafile from Datafile datafile , datafile.dataset as dataset , dataset.investigation as investigation , investigation.facility as facility where facility.id = 1 and investigation.name = 'NT11175' and investigation.id = 1010754770 and dataset.id = 1083896358 and UPPER(datafile.name) like concat('%', 'M8S8_2', '%') ORDER BY datafile.datafileCreateTime desc limit [offset], 50
query1: returned ids count: 1401 query1: unique returned ids count: 1347
query2: select distinct datafile from Datafile datafile , datafile.dataset as dataset , dataset.investigation as investigation , investigation.facility as facility where facility.id = 1 and investigation.name = 'NT11175' and investigation.id = 1010754770 and dataset.id = 1083896358 and UPPER(datafile.name) like concat('%', 'M8S8_2', '%') limit [offset], 50
query2: returned ids count: 1401 query2: unique returned ids count: 1401
ids with duplication: [1083931817, 1083931614, 1083930734, 1083930628, 1083930550, 1083931083, 1083931680, 1083931311, 1083931276, 1083930670, 1083931632, 1083931228, 1083930900, 1083930784, 1083931468, 1083930620, 1083931230, 1083930529, 1083930702, 1083930607, 1083930498, 1083931601, 1083931793, 1083930897, 1083931657, 1083930474, 1083931545, 1083931159, 1083931113, 1083931634, 1083930659, 1083931633, 1083930944, 1083931096, 1083931611, 1083930850, 1083931354, 1083931167, 1083931142, 1083930668, 1083930854, 1083930747, 1083931712, 1083930542, 1083930633, 1083930721, 1083931800, 1083931154, 1083931638, 1083931534, 1083931347, 1083931071, 1083931394, 1083931688]
sample entities with duplication {"Datafile"=> {"id"=>1083931817, "createId"=>"GUARDIAN", "createTime"=>"2015-07-20T22:22:57.298+01:00", "modId"=>"GUARDIAN", "modTime"=>"2015-07-20T22:22:57.298+01:00", "dataCollectionDatafiles"=>[], "datafileCreateTime"=>"2015-07-20T22:14:49.000+01:00", "datafileModTime"=>"2015-07-20T22:14:49.000+01:00", "description"=>"unknown", "destDatafiles"=>[], "fileSize"=>6230661, "location"=> "/dls/i04/data/2015/nt11175-56/pasdomain/pasdomain_M8S8_2_1350.cbf", "name"=>"pasdomain_M8S8_2_1350.cbf", "parameters"=>[], "sourceDatafiles"=>[]}}
duplication count: 2
{"Datafile"=> {"id"=>1083931614, "createId"=>"GUARDIAN", "createTime"=>"2015-07-20T22:22:56.716+01:00", "modId"=>"GUARDIAN", "modTime"=>"2015-07-20T22:22:56.716+01:00", "dataCollectionDatafiles"=>[], "datafileCreateTime"=>"2015-07-20T22:14:49.000+01:00", "datafileModTime"=>"2015-07-20T22:14:49.000+01:00", "description"=>"unknown", "destDatafiles"=>[], "fileSize"=>6230643, "location"=> "/dls/i04/data/2015/nt11175-56/pasdomain/pasdomain_M8S8_2_1349.cbf", "name"=>"pasdomain_M8S8_2_1349.cbf", "parameters"=>[], "sourceDatafiles"=>[]}}
duplication count: 2