matrix-org / sygnal

Sygnal: reference Push Gateway for Matrix
Apache License 2.0
160 stars 144 forks source link

Improve `glob_to_regex` performance and add case sensitivity flag #268

Closed squahtx closed 2 years ago

squahtx commented 2 years ago

Fixing #255 will involve more frequent usage of glob_to_regex, so it'd be nice if it didn't do so many string concatenations.

erikjohnston commented 2 years ago

Is this something we can also do to the Synapse implementation?

richvdh commented 2 years ago

ARGH. Please can we draw a line in the sand and resurrect syutil, rather than having two or three copies of everything?

squahtx commented 2 years ago

Is this something we can also do to the Synapse implementation?

I didn't know Synapse had an equivalent. Synapse's one seems far more clever and collapses runs of wildcards, which the technique here can't do.

richvdh commented 2 years ago

Synapse's one seems far more clever and collapses runs of wildcards

Turns out that if someone sends you a glob with three * in a row, that DoSes your entire process if you naively convert it to .*.*.*.

That sort of lesson is one of the reasons I'm really keen we pull it out to a shared library, so we don't have to keep learning the same lesson.

squahtx commented 2 years ago

Fixing #255 will involve more frequent usage of glob_to_regex, so it'd be nice if it didn't do so many string concatenations.

I'm not quite sure how — hopefully we're only running a few on startup anyway (since re.compile is expensive), but I can't argue that this is probably a better way of doing it.

I did a direct translation of re.search to glob_to_regex().match. I've put up #269 and #270 for judgement

squahtx commented 2 years ago

Turns out that if someone sends you a glob with three * in a row, that DoSes your entire process if you naively convert it to .*.*.*.

All the globs that Sygnal has to deal with come from the config thankfully. I do agree that we should move glob handling into syutil at some point and ditch this implementation.

reivilibre commented 2 years ago

I wonder if @richvdh feels strongly enough about making reviving syutil a blocker for this or not? There's the usual story of there being nothing more permanent than a temporary solution :P.

Otherwise, I'm happy to accept this now

richvdh commented 2 years ago

I wonder if @richvdh feels strongly enough about making reviving syutil a blocker for this or not?

I guess not, but honestly I feel like we have this discussion every time someone touches some code that ought to be in syutil - "it's done now, let's leave it until next time". At some point we really have to just draw a line.

IOW: I'd be much happier to see this fixed with syutil, but if you just want to get on and land the fixes, I won't cry too much provided people actually promise to use syutil next time.

callahad commented 2 years ago

Let's land it right now with the copy+pasted code and commit to extracting it out before the end of the month.

@richvdh If we don't extract it in time, I'll owe you a tipple of your choice.

squahtx commented 2 years ago

Replaced by #281