kiwix / kiwix-tools

Command line Kiwix tools: kiwix-serve, kiwix-manage, ...
GNU General Public License v3.0

Question mark in article title caused browser to report "The page isn’t redirecting properly" #589

Closed DarkmatterUAE closed 1 year ago

DarkmatterUAE commented 1 year ago

Opening the Wikipedia page "Quo Vadis?" from the search suggestions resulted in Firefox reporting:

The page isn’t redirecting properly Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

Reloading the page with dev tools open showed a flood of requests, each of which received a 302 with no response body. Every time I tested this, kiwix-serve (-v) logged 21 repetitions of the following:

Requesting : 
full_url  : /wikipedia_en_all_nopic_2022-01/A/Quo_vadis
method    : GET (0)
version   : HTTP/1.1
request#  : 20
headers   :
 - accept : 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8'
 - accept-encoding : 'gzip, deflate'
 - accept-language : 'en-US,en;q=0.5'
 - cache-control : 'no-cache'
 - connection : 'keep-alive'
 - cookie : 'filters=lang=eng'
 - dnt : '1'
 - host : 'server-ip:8888'
 - pragma : 'no-cache'
 - referer : 'http://server-ip:8888/?lang=en'
 - upgrade-insecure-requests : '1'
 - user-agent : 'Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0'
arguments :
Parsed : 
full_url: /wikipedia_en_all_nopic_2022-01/A/Quo_vadis
url   : /wikipedia_en_all_nopic_2022-01/A/Quo_vadis
acceptEncodingGzip : 1
has_range : 0
is_valid_url : 1
** running handle_content
Response :
httpResponseCode : 302
headers :
 - Cache-Control: 'no-cache, no-store, must-revalidate'
 - Access-Control-Allow-Origin: '*'
 - Location: '/wikipedia_en_all_nopic_2022-01/A/Quo_vadis?'
Request time : 0.002369s
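
The log above is consistent with a loop of the following shape: the `Location` header carries a literal `?`, which a client parses as the start of an (empty) query string, so the follow-up request again asks for the path ending in `Quo_vadis`. Below is a minimal Python sketch of that reading. `toy_server` and `follow` are hypothetical stand-ins written for illustration, not kiwix-serve or libkiwix code, and the limit of 20 redirects is an assumption chosen to match the 21 logged requests.

```python
from urllib.parse import unquote, urlsplit

# Assumed stored article path, inferred from the Location header in the log.
ARTICLE_PATH = "/wikipedia_en_all_nopic_2022-01/A/Quo_vadis?"

def toy_server(request_path):
    """Hypothetical stand-in for the logged behaviour (not kiwix-serve code).

    Serves the article on an exact match after percent-decoding; otherwise
    redirects to the article path WITHOUT percent-encoding the "?".
    """
    if unquote(request_path) == ARTICLE_PATH:
        return 200, None
    return 302, ARTICLE_PATH

def follow(url, max_redirects=20):
    """Follow redirects roughly like a browser; return (status, requests_made)."""
    requests_made = 0
    for _ in range(max_redirects + 1):
        # Only the path component reaches the handler; a literal "?" starts the query.
        status, location = toy_server(urlsplit(url).path)
        requests_made += 1
        if status != 302:
            return status, requests_made
        url = location
    return 302, requests_made  # gave up: the redirect never resolves

print(follow("/wikipedia_en_all_nopic_2022-01/A/Quo_vadis"))
print(follow("/wikipedia_en_all_nopic_2022-01/A/Quo_vadis%3F"))
```

In this sketch the unencoded request keeps bouncing until the redirect limit is hit, while the `%3F` form is served on the first request.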

Tested on two sets of binaries, including the ones from the Alpine Linux software repository. Version of the first set:

libkiwix 11.0.0
+ libzim 8.0.0
+ libxapian 1.4.18
+ libcurl 7.67.0
+ libmicrohttpd 0.9.72
+ libz 1.2.12
+ libicu 58.2.0
+ libpugixml 0.12.0
libzim 8.0.0
+ libzstd 1.5.2
+ liblzma 5.2.4
+ libxapian 1.4.18
+ libicu 58.2.0

Alpine Linux version:

libkiwix 11.0.0
+ libzim 8.0.0
+ libxapian 1.4.21
+ libcurl 7.86.0
+ libmicrohttpd 0.9.75
+ libz 1.2.13
+ libicu 72.1.0
+ libpugixml 1.12.0
libzim 8.0.0
+ libzstd 1.5.2
+ liblzma 5.2.7
+ libxapian 1.4.21
+ libicu 72.1.0

DarkmatterUAE commented 1 year ago

Some testing of my own:

Digging through the other issues gave me some insight into what the problem might be, so I replaced the question mark in the URL with %3F (the URL encoding of "?"), and the page loaded correctly.

But then I don't understand: why did kiwix-serve suggest a link that my browser can't load (I had to replace ? with %3F to load the page correctly)? Is there a problem with my ZIM file (I checked the digest!)? Or does kiwix-serve need to convert the link given by the ZIM file somehow?
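
For reference, the manual substitution described above is ordinary percent-encoding of the path. A short Python sketch, assuming the article is stored under the name `Quo_vadis?` (inferred from the `Location` header in the log); `article_href` is a hypothetical helper for illustration, not a kiwix-serve or libkiwix function:

```python
from urllib.parse import quote, urlsplit

def article_href(book, title):
    """Hypothetical helper: build a link whose path keeps the "?" of the title."""
    # quote() escapes "?" by default and leaves "/" alone.
    return "/" + quote(book) + "/A/" + quote(title)

raw = "/wikipedia_en_all_nopic_2022-01/A/Quo_vadis?"
encoded = article_href("wikipedia_en_all_nopic_2022-01", "Quo_vadis?")

# In the raw link the "?" is parsed as a query delimiter and is not part of the path.
print(urlsplit(raw).path)
# In the encoded link the "?" travels inside the path as %3F.
print(encoded)
print(urlsplit(encoded).path)
```

Whether the encoding belongs in the suggestion links, in the redirect target, or in both is a design question for the maintainers; the sketch only shows the difference between the two URL forms.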

veloman-yunkan commented 1 year ago

I suspect that this issue boils down to the same root cause as #587. It should be present in the latest release (3.4.0) of kiwix-tools, too, but may be fixed by kiwix/libkiwix#859.

kelson42 commented 1 year ago

@DarkmatterUAE Could you please check whether the bug is still there with the latest nightly?

DarkmatterUAE commented 1 year ago

No, it isn't fixed. The web interface changed a bit (probably due to the version change), but I'm still getting the "page isn’t redirecting properly" error. Using kiwix-tools_linux-i586-2022-12-17.tar.gz

libkiwix 12.0.0
+ libzim 8.1.0
+ libxapian 1.4.18
+ libcurl 7.67.0
+ libmicrohttpd 0.9.72
+ libz 1.2.12
+ libicu 58.2.0
+ libpugixml 0.12.0

libzim 8.1.0
+ libzstd 1.5.2
+ liblzma 5.2.6
+ libxapian 1.4.18
+ libicu 58.2.0

veloman-yunkan commented 1 year ago

Well, I believe that kiwix/libkiwix#860 will fix this issue once it is merged.

kelson42 commented 1 year ago

Great, so let's wait until the PR is merged and a new nightly is available for testing.

DarkmatterUAE commented 1 year ago

Nightly version kiwix-tools_linux-i586-2022-12-23.tar.gz still didn't solve the issue. Is this expected? :thinking:

$ ./kiwix-serve -V
kiwix-tools 3.4.0

libkiwix 12.0.0
+ libzim 8.1.0
+ libxapian 1.4.18
+ libcurl 7.67.0
+ libmicrohttpd 0.9.72
+ libz 1.2.12
+ libicu 58.2.0
+ libpugixml 0.12.0

libzim 8.1.0
+ libzstd 1.5.2
+ liblzma 5.2.6
+ libxapian 1.4.18
+ libicu 58.2.0

kelson42 commented 1 year ago

@DarkmatterUAE It's not expected! @veloman-yunkan Have we forgotten something?

veloman-yunkan commented 1 year ago

I found a ZIM file containing an article with a question mark in its name on my side and reproduced the issue (which I lazily, stupidly and complacently hadn't done when writing my earlier comment). This time we are dealing with a set of bugs hiding deeper in the C++ code and addressed by kiwix/libkiwix#775. There is a quick workaround for suggestions, but a similar issue for search results will persist (that one needs to be fixed in C++).

DarkmatterUAE commented 1 year ago

[Screenshot from 2023-01-07 04-24-20: Firefox showing a "too many redirects" error]

It still is not fixed... Using 2023-01-07's nightly build.

kiwix-tools 3.4.0

libkiwix 12.0.0
+ libzim 8.1.0
+ libxapian 1.4.18
+ libcurl 7.67.0
+ libmicrohttpd 0.9.72
+ libz 1.2.12
+ libicu 58.2.0
+ libpugixml 0.12.0

libzim 8.1.0
+ libzstd 1.5.2
+ liblzma 5.2.6
+ libxapian 1.4.18
+ libicu 58.2.0

veloman-yunkan commented 1 year ago

So it turns out that, at the time of reporting, this ticket was resting on a stack of several bugs. The remaining issue (I hope it is the last one) was already mentioned in my previous comment: it's kiwix/libkiwix#775. My last quick workaround didn't fully solve the issue because of an extra redirection involved in this case:

kelson42 commented 1 year ago

@veloman-yunkan So should we close this ticket as a duplicate of kiwix/libkiwix#775? Once that ticket is fixed/implemented, will the buggy behaviour reported here vanish?

veloman-yunkan commented 1 year ago

@kelson42 The buggy behaviour reported here should be eliminated by a rather small PR, kiwix/libkiwix#866, though similar issues could still be observed until kiwix/libkiwix#775 is fully fixed.

kelson42 commented 1 year ago

@veloman-yunkan Thank you for the fix. We should now focus on fixing kiwix/libkiwix#775.