rnewson / couchdb-lucene

Enables full-text searching of CouchDB documents using Lucene
Apache License 2.0

Sorting on string is not lexicographical #285

Closed hmdevelopermind closed 4 years ago

hmdevelopermind commented 4 years ago

Hi CouchDB team, @rnewson. I am trying to get sorting to work on strings in CouchDB. I have these string dates:

2017-11-16T12:55:28.101Z
2019-11-16T12:55:51.76Z
2018-11-16T12:55:28.883Z
2019-11-16T12:59:17.034Z

When I query CouchDB with a Lucene sort, for example limit=10&sort=-last_observed, I expect to see:

2019-11-16T12:59:17.034Z
2019-11-16T12:55:51.76Z
2018-11-16T12:55:28.883Z
2017-11-16T12:55:28.101Z

but the results do not come back in that order. Does CouchDB use some other logic for sorting?

Update: as soon as I change the date format from 2019-11-16T12:59:17.034Z to 2019-11-16T12-59-17.034Z (that is, replace each : with -), the sort works. So it seems that CouchDB is not able to sort lexicographically when the string contains a : character.
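For reference, plain string comparison does produce the expected order for these timestamps, with or without the colons. That suggests the difference comes from how the field is indexed, not from lexicographic ordering itself. A minimal Python sketch using only the four values above (the variable names are just for illustration):

```python
# The four last_observed values from this report.
timestamps = [
    "2017-11-16T12:55:28.101Z",
    "2019-11-16T12:55:51.76Z",
    "2018-11-16T12:55:28.883Z",
    "2019-11-16T12:59:17.034Z",
]

# Plain lexicographic sort, newest first (what sort=-last_observed should give).
descending = sorted(timestamps, reverse=True)

# Same values with ':' replaced by '-', as in the workaround.
dashed = [t.replace(":", "-") for t in timestamps]
dashed_descending = sorted(dashed, reverse=True)

for value in descending:
    print(value)
```

Both lists come out in the same relative order, so the colon does not affect a pure string comparison.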

Here is the full URL I use:

http://localhost:5984/uds_e84f522c602fa1f67bb5fe4350000384/_design/searchAll/_search/searchAll?q=:&limit%3D5%26sort%3Dlast_observed%3Cstring%3E&include_docs=true

and here is the response I get:

{"total_rows":4,"bookmark":"g1AAAABteJzLYWBgYMpgTmEQTM4vTc5ISXIwNDLXMwBCwxyQVCJDUv3___-zMpjc7D8wgEEiAx71eSwgJQ1A6j-6NqYsAOfjHAQ","rows":[{"id":"e84f522c602fa1f67bb5fe4350001b5d","order":[1.0,0],"fields":{"process.binary_ref.parent_directory_ref.path":"UNDEFINED","process.opened_connection_refs.dst_ref.value":"UNDEFINED","user_account.user_id":"John","file.hashes.MD5":"ad7b9c14083b52bc532fba5948342b98","network_traffic.src_ref.value":"UNDEFINED","process.binary_ref.name":"UNDEFINED","directory.path":"C:\\Windows\\SysWOW64","first_observed":"2019-11-16T12:59:17.034Z","process.opened_connection_refs.src_ref.value":"UNDEFINED","last_observed":"2019-11-16T12:59:17.034Z","process.opened_connection_refs.protocols":"UNDEFINED","process.opened_connection_refs.dst_port":"-1","number_observed":"1","process.opened_connection_refs.src_port":"-1","process.creator_user_ref.user_id":"UNDEFINED","process.created":"UNDEFINED","process.name":"UNDEFINED","network_traffic.dst_port":"-1","network_traffic.dst_ref.value":"UNDEFINED","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","network_traffic.src_port":"-1","file.name":"cmd.exe","ipv4_addr.value":"UNDEFINED","network_traffic.protocols":"UNDEFINED","file.parent_directory_ref.path":["C:\\Windows\\SysWOW64","C:\\Windows\\SysWOW64"],"process.pid":"UNDEFINED"},"doc":{"_id":"e84f522c602fa1f67bb5fe4350001b5d","_rev":"1-96fa85a6f5bcf2bec34a2b4090dbbdcc","type":"observed-data","id":"observed-data--aa498f89-917b-4634-adda-46a373536ea7","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","created":"2020-01-08T21:16:41.132Z","modified":"2020-01-08T21:16:41.132Z","first_observed":"2019-11-16T12:59:17.034Z","last_observed":"2019-11-16T12:59:17.034Z","number_observed":1,"objects":{"0":{"type":"directory","path":"C:\\Windows\\SysWOW64"},"1":{"type":"file","hashes":{"MD5":"ad7b9c14083b52bc532fba5948342b98"},"name":"cmd.exe","parent_directory_ref":"0"},"2":{"type":"user-account","user_id":"John"}}}},{"id":"e84f522c602fa1f
67bb5fe43500009c2","order":[1.0,0],"fields":{"process.binary_ref.parent_directory_ref.path":["C:\\Windows\\System32","C:\\Windows\\System32"],"process.opened_connection_refs.dst_ref.value":"192.168.1.1","user_account.user_id":"SYSTEM","file.hashes.MD5":"UNDEFINED","network_traffic.src_ref.value":"192.168.1.156","process.binary_ref.name":["svchost.exe","svchost.exe"],"directory.path":"C:\\Windows\\System32","first_observed":"2019-11-16T12:55:28.101Z","process.opened_connection_refs.src_ref.value":"192.168.1.156","last_observed":"2017-11-16T12:55:28.101Z","process.opened_connection_refs.protocols":["tcp","tcp","tcp","tcp"],"process.opened_connection_refs.dst_port":["47413","47413","47413","47413"],"number_observed":"1","process.opened_connection_refs.src_port":["60842","60842","60842","60842"],"process.creator_user_ref.user_id":["SYSTEM","SYSTEM"],"process.created":"2022-11-16T12:55:28.101Z","process.name":"svchost.exe","network_traffic.dst_port":"47413","network_traffic.dst_ref.value":"192.168.1.1","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","network_traffic.src_port":"60842","file.name":"svchost.exe","ipv4_addr.value":"192.168.1.1","network_traffic.protocols":"tcp","file.parent_directory_ref.path":["C:\\Windows\\System32","C:\\Windows\\System32"],"process.pid":"1380"},"doc":{"_id":"e84f522c602fa1f67bb5fe43500009c2","_rev":"1-5ae741ca75f6b91ce4dabf1961b69f4f","type":"observed-data","id":"observed-data--51d886d7-397b-4ab8-acb2-201d4ad5a303","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","created":"2020-01-08T21:16:40.679Z","modified":"2020-01-08T21:16:40.679Z","first_observed":"2019-11-16T12:55:28.101Z","last_observed":"2017-11-16T12:55:28.101Z","number_observed":1,"objects":{"0":{"type":"directory","path":"C:\\Windows\\System32"},"1":{"type":"file","name":"svchost.exe","parent_directory_ref":"0"},"2":{"type":"user-account","user_id":"SYSTEM"},"3":{"type":"ipv4-addr","value":"192.168.1.156"},"4":{"type":"ipv4-addr","value":"192
.168.1.1"},"5":{"type":"network-traffic","src_ref":"3","dst_ref":"4","src_port":60842,"dst_port":47413,"protocols":["tcp"]},"6":{"type":"process","pid":1380,"name":"svchost.exe","created":"2022-11-16T12:55:28.101Z","opened_connection_refs":["5"],"creator_user_ref":"2","binary_ref":"1"}}}},{"id":"e84f522c602fa1f67bb5fe4350000aec","order":[1.0,1],"fields":{"process.binary_ref.parent_directory_ref.path":["C:\\Windows\\System32","C:\\Windows\\System32"],"process.opened_connection_refs.dst_ref.value":"127.0.0.1","user_account.user_id":"LOCAL SERVICE","file.hashes.MD5":"UNDEFINED","network_traffic.src_ref.value":"127.0.0.1","process.binary_ref.name":["svchost.exe","svchost.exe"],"directory.path":"C:\\Windows\\System32","first_observed":"2019-11-16T12:55:28.883Z","process.opened_connection_refs.src_ref.value":"127.0.0.1","last_observed":"2018-11-16T12:55:28.883Z","process.opened_connection_refs.protocols":["tcp","tcp","tcp","tcp"],"process.opened_connection_refs.dst_port":["5357","5357","5357","5357"],"number_observed":"1","process.opened_connection_refs.src_port":["60843","60843","60843","60843"],"process.creator_user_ref.user_id":["LOCAL SERVICE","LOCAL 
SERVICE"],"process.created":"2019-11-16T12:55:28.883Z","process.name":"svchost.exe","network_traffic.dst_port":"5357","network_traffic.dst_ref.value":"127.0.0.1","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","network_traffic.src_port":"60843","file.name":"svchost.exe","ipv4_addr.value":"127.0.0.1","network_traffic.protocols":"tcp","file.parent_directory_ref.path":["C:\\Windows\\System32","C:\\Windows\\System32"],"process.pid":"1352"},"doc":{"_id":"e84f522c602fa1f67bb5fe4350000aec","_rev":"1-6eedd46398451bbcc166f500343d6359","type":"observed-data","id":"observed-data--840fcf8b-cbe1-4ca3-acae-58835e2f807b","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","created":"2020-01-08T21:16:40.688Z","modified":"2020-01-08T21:16:40.688Z","first_observed":"2019-11-16T12:55:28.883Z","last_observed":"2018-11-16T12:55:28.883Z","number_observed":1,"objects":{"0":{"type":"directory","path":"C:\\Windows\\System32"},"1":{"type":"file","name":"svchost.exe","parent_directory_ref":"0"},"2":{"type":"user-account","user_id":"LOCAL SERVICE"},"3":{"type":"ipv4-addr","value":"127.0.0.1"},"4":{"type":"ipv4-addr","value":"127.0.0.1"},"5":{"type":"network-traffic","src_ref":"3","dst_ref":"4","src_port":60843,"dst_port":5357,"protocols":["tcp"]},"6":{"type":"process","pid":1352,"name":"svchost.exe","created":"2019-11-16T12:55:28.883Z","opened_connection_refs":["5"],"creator_user_ref":"2","binary_ref":"1"}}}},{"id":"e84f522c602fa1f67bb5fe4350000d44","order":[1.0,2],"fields":{"process.binary_ref.parent_directory_ref.path":["C:\\Windows\\System32","C:\\Windows\\System32"],"process.opened_connection_refs.dst_ref.value":"172.16.0.100","user_account.user_id":"LOCAL 
SERVICE","file.hashes.MD5":"UNDEFINED","network_traffic.src_ref.value":"239.255.255.250","process.binary_ref.name":["svchost.exe","svchost.exe"],"directory.path":"C:\\Windows\\System32","first_observed":"2019-11-16T12:55:51.76Z","process.opened_connection_refs.src_ref.value":"239.255.255.250","last_observed":"2019-11-16T12:55:51.76Z","process.opened_connection_refs.protocols":["udp","udp","udp","udp"],"process.opened_connection_refs.dst_port":["63519","63519","63519","63519"],"number_observed":"1","process.opened_connection_refs.src_port":["1900","1900","1900","1900"],"process.creator_user_ref.user_id":["LOCAL SERVICE","LOCAL SERVICE"],"process.created":"2021-11-16T12:55:51.76Z","process.name":"svchost.exe","network_traffic.dst_port":"63519","network_traffic.dst_ref.value":"172.16.0.100","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","network_traffic.src_port":"1900","file.name":"svchost.exe","ipv4_addr.value":"172.16.0.100","network_traffic.protocols":"udp","file.parent_directory_ref.path":["C:\\Windows\\System32","C:\\Windows\\System32"],"process.pid":"1728"},"doc":{"_id":"e84f522c602fa1f67bb5fe4350000d44","_rev":"1-865a96ca2678e196219a8d31e78a03ed","type":"observed-data","id":"observed-data--8418c73d-a3b3-47d0-b086-d529516c9634","created_by_ref":"identity--5cd620db-2254-4aca-beb2-f80a4f90eaac","created":"2020-01-08T21:16:40.701Z","modified":"2020-01-08T21:16:40.701Z","first_observed":"2019-11-16T12:55:51.76Z","last_observed":"2019-11-16T12:55:51.76Z","number_observed":1,"objects":{"0":{"type":"directory","path":"C:\\Windows\\System32"},"1":{"type":"file","name":"svchost.exe","parent_directory_ref":"0"},"2":{"type":"user-account","user_id":"LOCAL 
SERVICE"},"3":{"type":"ipv4-addr","value":"239.255.255.250"},"4":{"type":"ipv4-addr","value":"172.16.0.100"},"5":{"type":"network-traffic","src_ref":"3","dst_ref":"4","src_port":1900,"dst_port":63519,"protocols":["udp"]},"6":{"type":"process","pid":1728,"name":"svchost.exe","created":"2021-11-16T12:55:51.76Z","opened_connection_refs":["5"],"creator_user_ref":"2","binary_ref":"1"}}}}]}

As you can see, last_observed is not sorted properly.
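A quick way to confirm this from a response is to pull last_observed out of each row and compare against a sorted copy. The sketch below is only an illustration: the helper name is_sorted_by is made up, and the rows are trimmed down to the one field of interest, in the order they appear in the response above.

```python
def is_sorted_by(rows, field, descending=False):
    """Return True if rows are ordered by rows[i]["fields"][field]."""
    values = [row["fields"][field] for row in rows]
    return values == sorted(values, reverse=descending)


# last_observed values in the order the response returned them.
rows = [
    {"fields": {"last_observed": "2019-11-16T12:59:17.034Z"}},
    {"fields": {"last_observed": "2017-11-16T12:55:28.101Z"}},
    {"fields": {"last_observed": "2018-11-16T12:55:28.883Z"}},
    {"fields": {"last_observed": "2019-11-16T12:55:51.76Z"}},
]

ascending_ok = is_sorted_by(rows, "last_observed")
descending_ok = is_sorted_by(rows, "last_observed", descending=True)
print(ascending_ok, descending_ok)
```

Neither check passes for this response, so the rows are in neither ascending nor descending order of last_observed.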

hmdevelopermind commented 4 years ago

Thanks to a suggestion from the CouchDB team in the Slack channel, this was fixed by using the keyword analyzer.
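The actual design document is not shown in this thread, so the following is only a sketch of what the change might look like. It assumes the index is defined in the CouchDB search (_search) format with an indexes section, and it reuses the design document and index name searchAll from the URL above. The index function body is hypothetical.

```python
import json

# Hypothetical design document; only the "analyzer" setting reflects the fix
# described above. The index function is a made-up minimal example.
design_doc = {
    "_id": "_design/searchAll",
    "indexes": {
        "searchAll": {
            # The keyword analyzer indexes the whole field value as one token.
            "analyzer": "keyword",
            "index": (
                "function (doc) {"
                " if (doc.last_observed) {"
                " index('last_observed', doc.last_observed, {store: true});"
                " }"
                " }"
            ),
        }
    },
}

print(json.dumps(design_doc, indent=2))
```

With the keyword analyzer the timestamp is kept as a single term. That is presumably why the sort then behaves lexicographically, although the thread does not spell out the reason. If other fields still need full-text tokenization, a per-field analyzer setting may be a better fit than switching the whole index.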