Extremely slow query that generates an out of memory error

mihaicosareanu-bolt commented 3 years ago

Describe the bug

I have found a query that brings pelias-api to its knees. I'm currently trying to understand the code behind the Tokenizer and ExclusiveCartesianSolver, but I thought I should give you a heads up as well, maybe you will understand much faster what happens. I've tried other queries with multiple tokens but none behave as bad as this one :).

Steps to Reproduce (link redacted for somewhat obvious reasons. Feel free to look at the edit history if you are interested but take care to not hurt your own or others' Pelias servers. Not everyone has patched yet 😁)

What I've noticed is that in a previous pelias version (tried a commit from 29 Jun 2020) it works fast. So there were some changes in between that have affected this (on my machine the latest master returns in 22 seconds vs 800 ms in the 2020 June commit).

Expected behavior Should return as fast as other queries with multiple tokens do.

Environment (please complete the following information): Linux & Mac

Pastebin/Screenshots

Additional context

References

orangejulius commented 3 years ago

Hi @mihaicosareanu-bolt, Thanks for reporting this, it explains some slow queries we've been seeing lately :)

I do think this came in as part of https://github.com/pelias/api/pull/1452, where we upgraded the parser a bunch.

I'm also not super familiar with the parser code, and that pull request brought in quite a few changes, so it could be hard to figure out exactly what broke it or how to fix it. But we'll be looking as well, let us know if you find anything.

mihaicosareanu-bolt commented 3 years ago

Thanks for looking into it! I will spend more time on it next week :)

orangejulius commented 3 years ago

Alright, we've tracked it down to a PR to add unit number support (https://github.com/pelias/parser/pull/87). Pelias doesn't fully support unit numbers at this time, so that functionality was not really doing much, as far as I know.

As a stopgap measure, this branch of pelias/parser which removes that functionality appears to be working well.

mihaicosareanu-bolt commented 3 years ago

Thanks a lot Julius! Indeed, tested on my machine and it works without that code. Do you plan on retiring this bit until we have proper support of unit numbers? What are your thoughts?

orangejulius commented 3 years ago

I'm not 100% sure what we'll do yet, though we'll definitely put in a fix one way or another. I'd like to discuss with @missinglink first, but he's in the middle of some fairly disruptive travel, so it will be a few days before we can chat about it.

mihaicosareanu-bolt commented 3 years ago

Great! Looking forward to your conclusion :). Safe travels @missinglink :D

orangejulius commented 3 years ago

Okay, we've just put in a 'hotfix' for this issue, and it should be effectively resolved with no regressions in result quality.

After talking with @missinglink there are a bunch of different ways we can put in a long term fix, which we can discuss here.

For now though, anyone using the master branch of the Pelias API shouldn't see any slow parses or out of memory issues.

pelias / api

Extremely slow query that generates an out of memory error #1530

References