dronefly-garden / dronefly

Red Discord Bot V3 cogs for naturalists.
Other
16 stars 3 forks source link

taxon: handle apostrophe (mismatched quote) #117

Closed synrg closed 4 years ago

synrg commented 4 years ago

The new parser has issues with apostrophes in taxon queries. Mason triggered this on iNat Discord with a search of ,taxon elephant's head. The workaround for now is to use double-quotes, e.g. ,taxon "elephant's head" but the user should not have to type this. In the log I see:

[2020-07-22 09:13:16] [ERROR] red: ConversionError
Traceback (most recent call last):
  File "/home/synrg/.local/lib/python3.8/site-packages/discord/ext/commands/core.py", line 430, in _actual_conversion
    ret = await method(ctx, argument)
  File "/home/synrg/.local/share/Appledore/cogs/CogManager/cogs/inatcog/converters.py", line 205, in convert
    return await super(NaturalCompoundQueryConverter, cls).convert(
  File "/home/synrg/.local/share/Appledore/cogs/CogManager/cogs/inatcog/converters.py", line 152, in convert
    terms, phrases, code = detect_terms_phrases_code(vals.main)
  File "/home/synrg/.local/share/Appledore/cogs/CogManager/cogs/inatcog/converters.py", line 108, in detect_terms_phrases_code
    terms = shlex.split(" ".join(list(terms_and_phrases)))
  File "/usr/local/lib/python3.8/shlex.py", line 311, in split
    return list(lex)
  File "/usr/local/lib/python3.8/shlex.py", line 300, in __next__
    token = self.get_token()
  File "/usr/local/lib/python3.8/shlex.py", line 109, in get_token
    raw = self.read_token()
  File "/usr/local/lib/python3.8/shlex.py", line 191, in read_token
    raise ValueError("No closing quotation")
ValueError: No closing quotation

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/synrg/.local/lib/python3.8/site-packages/discord/ext/commands/bot.py", line 892, in invoke
    await ctx.command.invoke(ctx)
  File "/home/synrg/.local/lib/python3.8/site-packages/discord/ext/commands/core.py", line 790, in invoke
    await self.prepare(ctx)
  File "/home/synrg/.local/lib/python3.8/site-packages/redbot/core/commands/commands.py", line 471, in prepare
    await self._parse_arguments(ctx)
  File "/home/synrg/.local/lib/python3.8/site-packages/discord/ext/commands/core.py", line 679, in _parse_arguments
    kwargs[name] = await self.transform(ctx, param)
  File "/home/synrg/.local/lib/python3.8/site-packages/discord/ext/commands/core.py", line 526, in transform
    return await self.do_conversion(ctx, converter, argument, param)
  File "/home/synrg/.local/lib/python3.8/site-packages/redbot/core/commands/commands.py", line 497, in do_conversion
    return await super().do_conversion(ctx, converter, argument, param)
  File "/home/synrg/.local/lib/python3.8/site-packages/discord/ext/commands/core.py", line 479, in do_conversion
    return await self._actual_conversion(ctx, converter, argument, param)
  File "/home/synrg/.local/lib/python3.8/site-packages/discord/ext/commands/core.py", line 438, in _actual_conversion
    raise ConversionError(converter, exc) from exc
discord.ext.commands.errors.ConversionError: (<class 'inatcog.converters.NaturalCompoundQueryConverter'>, ValueError('No closing quotation'))
synrg commented 4 years ago

In inatcog/converters.py, we use this "cheat". To handle the transform of terms plus phrases (a list produced by shlex.split with posix=false so double-quotes are kept) to just terms:

Unforunately, the cheat causes the bug. Unbalanced single-quotes aren't Posix-compliant. Since we aren't actually processing a Posix-compliant commandline, we shouldn't be cutting corners like this!

A correct implementation would be to map the list of terms & phrases, dropping leading/trailing double-quotes.

But even posix=false doesn't save us from potential issues here. If you have a query string with a single unmatched double-quote in it, that will break, too. Arguably, though, that is what we would like to happen (though the raised exception needs to be caught and handled, rather than just dying and throwing errors into the log).

Now that I think of it, this sounds suspiciously like the bug on aliases in Red core itself that caused us grief when a single unmatched curly-apostrophe appears in an aliased command's arguments. I wonder if shlex is at the root of that issue, too? This bears further investigation, as whatever we come up with here might help solve that issue, too.

synrg commented 4 years ago

Looking at the shlex doc, in addition to always using posix=false, instead of using split (which creates an instance) we could create an instance manually, then set escape="" on it (i.e. no escape) and finally, set quotes to only include the double-quote, instead of also single-quotes. After that, I guess we could catch the unbalanced quotes issue and retry the parse with a double-quote added to the end of the string if that corner case is encountered (i.e. automatically balance the double-quotes, making the string into a phrase).

If that solves the apostrophe issue, then we can continue to use the posix=true hack to unwrap the quotes. Otherwise, the map and substitute approach on each phrase should work, though it's not quite as elegant.

synrg commented 4 years ago

Fixed in 222c64169607ed951d139d2b87f9d9a650fb328b. I went a different way with it, though.