openzim / zim-requests

Want a new ZIM file? Propose ZIM content improvements or fixes? Here you are!
https://farm.openzim.org

athena_fr_all is broken #1029

Open benoit74 opened 2 weeks ago

benoit74 commented 2 weeks ago

ZIM(s) location

https://library.kiwix.org/viewer#athena_fr_all_2024-05

Recipe(s) URL

https://farm.openzim.org/recipes/athena_fr_all/

Readers tested

Both ZIM versions impacted?

Yes, both versions are impacted

Details

All pages on https://library.kiwix.org/content/athena_fr_all_2024-05/A/athena.unige.ch/athena/mineral/min_lists.html are broken.

The search pages for minerals and for athena e-texts are also not working.

We should probably fix this.

Jaifroid commented 2 weeks ago

"The search pages for minerals and for athena e-texts are also not working. We should probably fix this."

@benoit74 Given that Webrecorder records a visit to a site (Request-Response pairs), it seems difficult to fix search, since the crawler cannot anticipate search Requests in order to record the Responses. We might be lucky if the search technology is client-side, and requests can be piped generically to the underlying JS, but generally ISTM that if the precise POST Request (or querystring-based Request) generated by the user has not been recorded during scraping, it is likely to return 404 from the ZIM asset lookup. Do you concur? Do we have any examples of search based on arbitrary user input working in Zimit-style ZIMs?
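To make the concern concrete, here is a toy model of exact-match replay. It is only a sketch and not the actual Zimit or Webrecorder lookup code: the `ARCHIVE` dictionary, the `replay` function and the `example.org` URLs are all hypothetical, and a real replay layer may apply looser matching than this.

```python
from typing import Dict, NamedTuple, Tuple


class RecordedRequest(NamedTuple):
    """Key identifying one Request captured during the crawl (hypothetical model)."""
    method: str
    url: str
    body: bytes = b""


# Hypothetical archive: only Requests the crawler actually made have a Response.
ARCHIVE: Dict[RecordedRequest, bytes] = {
    RecordedRequest("GET", "https://example.org/minerals/list.html"): b"<html>list</html>",
    RecordedRequest("GET", "https://example.org/search?q=quartz"): b"<html>quartz results</html>",
}


def replay(method: str, url: str, body: bytes = b"") -> Tuple[int, bytes]:
    """Return (status, payload) for a Request, using exact-match lookup only."""
    key = RecordedRequest(method.upper(), url, body)
    payload = ARCHIVE.get(key)
    if payload is None:
        # The crawler never saw this exact Request, so there is nothing to serve.
        return 404, b""
    return 200, payload
```

In this model a search for "quartz" replays fine because that exact URL happened to be crawled, whereas any other term, or the same term sent as a POST, falls through to 404.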

benoit74 commented 2 weeks ago

I concur. The usual fix consists of hiding the search bar with custom CSS; this is what I meant. I also agree it is typical and should not totally surprise the user.

I think I've already found one example of search working inside a ZIM because it was fully client-side, but tbh I do not remember which ZIM it was. I should have noted it; it would be an interesting way to show you're right.

Jaifroid commented 2 weeks ago

I don't know for sure, but I think search could only work if client-side code creates a URL (possibly with querystring rather than with POST) that exactly matches a URL that has been scraped because it is also discoverable via hyperlinking...
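A rough sketch of that condition, under stated assumptions: `SCRAPED_URLS`, `build_search_url` and the `example.org` addresses are invented for illustration, and the normalisation step stands in for whatever the site's client-side code really does.

```python
from urllib.parse import urlencode

# Hypothetical set of URLs the crawler reached by following hyperlinks.
SCRAPED_URLS = {
    "https://example.org/search?q=quartz",
    "https://example.org/search?q=calcite",
}


def build_search_url(term: str) -> str:
    """Mimic client-side code that turns user input into a querystring URL."""
    normalised = term.strip().lower()
    return "https://example.org/search?" + urlencode({"q": normalised})


def search_resolves(term: str) -> bool:
    """Search 'works' only when the generated URL exactly matches a scraped one."""
    return build_search_url(term) in SCRAPED_URLS
```

Here `search_resolves(" Quartz ")` succeeds only because a link to that exact results page was crawled; a term such as "feldspar" generates a URL that was never scraped, so it would not resolve.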

benoit74 commented 1 week ago

I've updated the ZIM to Zimit2, but the problems are still there.