jamulix closed this issue 3 years ago
Thanks for the details and the response file. @zmwangx please have a look.
Yes, this is improperly closed tag soup; prettier can't even format it:
$ prettier -w googler-response-3vjkj5_r.html
googler-response-3vjkj5_r.html
[error] googler-response-3vjkj5_r.html: SyntaxError: Unexpected closing tag "body". It may happen when the tag has already been closed by another tag. For more info see https://www.w3.org/TR/html5/syntax.html#closing-elements-that-have-implied-end-tags (1067:346)
[error] 1065 | ]
[error] 1066 | ]
[error] > 1067 | , sideChannel: {}});</script><script id="wiz_jd" nonce="bzE0bpnLpLEKxNbvHL7Y6w">if (window['_wjdc']) {const wjd = {}; window['_wjdc'](wjd); delete window['_wjdc'];}</script><script aria-hidden="true" nonce="bzE0bpnLpLEKxNbvHL7Y6w">window.wiz_progress&&window.wiz_progress(); window.stopScanForCss&&window.stopScanForCss(); ccTick('bl');</script></body></html>
[error] | ^^^^^^^
I'll see later what I can do short of introducing a full-blown HTML5 parser.
Wait, it's actually some kind of notice...
So the good news is we don't actually need to parse this tag soup successfully, as the content is meaningless. The bad news is I'm not sure how to work around this when I don't even get this notice myself in the first place.
Probably need user contribution.
I believe this is some kind of a user consent prompt which we can't parse.
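For anyone who wants to at least detect this case instead of failing in the parser, here is a minimal sketch. It assumes the interstitial references consent.google.com (the host the "I agree" button leads to, as noted in this thread); that is a guess about the page content, not something verified against the attached response file. The function name is hypothetical and not part of googler.

```python
# Hypothetical helper, not part of googler.
# Assumption: the consent interstitial mentions this host somewhere in its HTML.
CONSENT_HOST = "consent.google.com"


def looks_like_consent_page(html: str) -> bool:
    """Heuristic: return True if the response body references the consent host."""
    return CONSENT_HOST in html.lower()


if __name__ == "__main__":
    sample = '<form action="https://consent.google.com/s" method="POST"></form>'
    print(looks_like_consent_page(sample))
```

A plain substring check is used on purpose: it does not depend on the markup being well-formed, so the broken closing tags do not matter here. It may produce false positives if a regular results page ever links to that host.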
This seems to be German. Is it possible to search Google in a regular browser?
Translated:
It's just some annoying cookie consent crap. Some EU user needs to figure out what cookie to add to suppress this.
Clicking on "I agree" takes you to https://consent.google.com, which sets a cookie like this along with a 303 redirect, FWIW:
set-cookie: NID=214=DeX...<long string omitted>..._qE; expires=Tue, 26-Oct-2021 03:35:42 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=none
On NID cookie: https://policies.google.com/technologies/cookies?hl=en-US#:~:text=For%20example%2C%20most,user%E2%80%99s%20last%20use.
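To inspect a header like the one above, the standard library's http.cookies module can split it into its attributes. This is only a sketch: the cookie value below is a placeholder, since the real NID string is elided in this thread, and parse_set_cookie is a hypothetical helper name. Parsing the SameSite attribute needs Python 3.8 or later.

```python
from http.cookies import SimpleCookie

# Placeholder value: the real NID string is elided in the thread and not reproduced here.
HEADER = (
    "NID=214=PLACEHOLDER; expires=Tue, 26-Oct-2021 03:35:42 GMT; "
    "path=/; domain=.google.com; Secure; HttpOnly; SameSite=none"
)


def parse_set_cookie(header_value: str) -> dict:
    """Parse one Set-Cookie header value into {name: {attribute: value}}."""
    jar = SimpleCookie()
    jar.load(header_value)
    parsed = {}
    for name, morsel in jar.items():
        parsed[name] = {
            "value": morsel.value,
            "expires": morsel["expires"],
            "path": morsel["path"],
            "domain": morsel["domain"],
            "secure": bool(morsel["secure"]),
            "httponly": bool(morsel["httponly"]),
            "samesite": morsel["samesite"],
        }
    return parsed


if __name__ == "__main__":
    for name, attrs in parse_set_cookie(HEADER).items():
        print(name, attrs)
```

This only shows what the consent endpoint sets. Whether replaying this cookie, or some other one, suppresses the prompt is still an open question here.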
I am based in Germany and am seeing the same thing.
I don't think we can do anything here. Closing the issue.
Since April 1, I always get the same error message when searching for News with googler version 4.3.2 and Python version 3.8.5, with or without the noprompt option (--np). It does not depend on the Python 3 version either: it's the same with Python 3.7. I suppose the reason is that Google has been returning a different response format since April 1 (the day of first occurrence). The response format might also differ between Europe and Asia.
Example:
Debug output:
[DEBUG] Response body written to '/tmp/googler-response-3vjkj5_r.html' is here: response.zip
Code snippet: "/usr/local/bin/googler", around line 705
Ubuntu 20.04
Kernel: Linux raika 5.4.0-72-lowlatency #80-Ubuntu SMP PREEMPT Mon Apr 12 18:37:24 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
googler version 4.3.2
Python version 3.8.5
KDE Konsole version 19.12.3