internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0
4.98k stars 1.25k forks source link

Carousels don't support `'` character in subjects #9036

Open RayBB opened 3 months ago

RayBB commented 3 months ago

Problem

I tried to make a carousel like this: {{QueryCarousel('subject:("New York Times bestseller" AND "Children's fiction")', title="New York Times Best Seller Children's Fiction", sort='random.hourly', key="nyt_childrens_fiction")}}

But it doesn't work. and escaping the ' with a \' also didn't work.

Evidence / Screenshot

https://openlibrary.org/collections/nyt-bestsellers

Relevant URL(s)

Reproducing the bug

  1. Go to ...
  2. Do ...

Context

Notes from this Issue's Lead

Proposal & constraints

Related files

Stakeholders

benbdeitch commented 2 months ago

I'd be happy to work on this, if you could assign me to it!

RayBB commented 2 months ago

@benbdeitch I think this could be a relatively tricky issue (@cdrini will have more of an opinion than me) but you can go ahead and try if you'd like.

benbdeitch commented 2 months ago

So, the root of the problem lies in how we currently evaluate macro arguments, and is tied up with Web.py's templates, of all things. I'll provide further updates as I go.

RayBB commented 2 months ago

@benbdeitch that sounds about right. You might be able to write a test for this in the query utils (if that's the right path) to get going as well https://github.com/internetarchive/openlibrary/blob/bee8ab4e369a500e1e31f73d3f06fb88e7c5896f/openlibrary/solr/query_utils.py#L234

benbdeitch commented 2 months ago

So far as I can tell, query_utils don't actually feature anywhere in the problem; the issue is arising when, in the process of web.py tokenizing the code. That's where we're hitting the 'end of file' error, though I'm still investigating why commenting out the apostrophe doesn't fix the issue.

tfmorris commented 2 months ago

What version of web.py are you using? The latest release (v0.70) is affected by https://github.com/webpy/webpy/issues/784 which sounds like it could be relevant.

benbdeitch commented 2 months ago

This looks extremely relevant, and possibly exactly the problem that's going on. It still doesn't fully explain to me why the syntactically correct (when properly escaped) python code isn't getting properly parsed, though, so that's another manner.

benbdeitch commented 2 months ago

I'm somewhat convinced now that this could be exactly what's going on, since the fix that handles that issue only addresses " characters, and not the single quotes.

benbdeitch commented 2 months ago

image Given this, I'm not certain if we should try to evaluate the macro inputs in a different manner; relying on something that ultimately uses this code doesn't feel like a great idea.

At the same time, I recognize that I don't fully understand the reason why we're creating a web.py template to evaluate the macro's args in, in the first place. Might changing the approach entirely be worth discussing?

tfmorris commented 2 months ago

[For those following along, the image above comes from https://docs.python.org/3.12/library/tokenize.html]

since the fix that handles that issue only addresses " characters, and not the single quotes.

I think you're referring to an earlier provisional fix which wasn't actually used in the end. The code from the actual PR, which doesn't include any character specific handling, is here:

https://github.com/webpy/webpy/pull/785/files#diff-1ab8f1b1d3ea7f4df92d7266d537570f5b631e382ddbef3ebce2bf2dfa71d566R335-R350

It still doesn't fully explain to me why the syntactically correct (when properly escaped) python code isn't getting properly parsed

It's been a while since I looked at it, but I seem to remember that some things are parsed piecewise, which may cause issues in this particular scenario. Swapping the usage of ' and " and then escaping the inner ' might provide a workaround, if this is part of the problem.

I recognize that I don't fully understand the reason why we're creating a web.py template to evaluate the macro's args in

This discussion would be a lot easier to follow with a reference to the code involved.

If you come up with a failing test that can be added here, I'd be happy to look at it:

https://github.com/webpy/webpy/blob/d3649322b85777b291ac2b7b3699fb6fc839e382/tests/test_template.py#L53-L56

benbdeitch commented 2 months ago

Wonderful, thanks! I'll try to get that test for you, asap, and I'll also give that suggestion a shot.

The actual code that is failing is in the `infogami.utils.macro.py' file, specifically the 'safeeval_args' function. Sorry for the earlier vagarities, I incorrectly thought I had already mentioned it.

tfmorris commented 2 months ago

I'll also give that suggestion a shot

I already tried that and it didn't help. You can see my experiments in the history at the page in the original issue (it wasn't clear to me before that it contained the failing example). The failure mode is apparently that it returns the original string rather than the results of the macro invocation, but doesn't provide any error information or traceback.