UlionTse / translators

🌏🌍🌎Translators🌎🌍🌏 is a library that aims to bring free, multiple, enjoyable translations to individuals and students in Python.
https://pypi.org/project/translators/
GNU General Public License v3.0
1.62k stars, 189 forks

[Bug]: workaround of consent.google.com doesn't work anymore #142

Closed by Cabu 11 months ago

Cabu commented 11 months ago

What happened?

The trick of fetching host_url again with the consent cookie (server.py, lines 591 to 593) doesn't seem to work anymore. It seems that a POST with a bunch of inputs now has to be made instead.

APP Version

5.8.3

Python Version

3.9 (Default)

Runtime Environment

Linux CentOS (Default)

Country/Region

Belgium

Relevant log output

Part of host_html when receiving the consent page containing the POST and its fields:

<form action="https://consent.google.com/save" method="POST" style="display:inline;" jsaction="JIbuQc:ldDdv(b3VHJd)"><div class="lssxud"><div class="VfPpkd-dgl2Hf-ppHlrf-sM5MNb" data-is-touch-wrapper="true"><button class="VfPpkd-LgbsSe VfPpkd-LgbsSe-OWXEXe-k8QpJ VfPpkd-LgbsSe-OWXEXe-dgl2Hf nCP5yc AjY5Oe DuMIQc LQeN7 Nc7WLe" jscontroller="soHxf" jsaction="click:cOuCgd; mousedown:UX7yZ; mouseup:lbsD7e; mouseenter:tfO1Yc; mouseleave:JywGue; touchstart:p6p2H; touchmove:FwuNnf; touchend:yfqBxc; touchcancel:JMtRjd; focus:AHmuwe; blur:O22p3e; contextmenu:mg9Pef;mlnRJb:fLiPzd;" data-idom-class="nCP5yc AjY5Oe DuMIQc LQeN7 Nc7WLe" jsname="b3VHJd" aria-label="Accept all"><div class="VfPpkd-Jh9lGc"></div><div class="VfPpkd-J1Ukfc-LhBDec"></div><div class="VfPpkd-RLmnJb"></div><span jsname="V67aGc" class="VfPpkd-vQzf8d" aria-hidden="true">Accept all</span></button></div></div><input type="hidden" name="gl" value="IE"><input type="hidden" name="m" value="0"><input type="hidden" name="app" value="0"><input type="hidden" name="pc" value="t"><input type="hidden" name="continue" value="https://translate.google.com/"><input type="hidden" name="x" value="6"><input type="hidden" name="bl" value="boq_identityfrontenduiserver_20230910.08_p0"><input type="hidden" name="hl" value="en-US"><input type="hidden" name="src" value="1"><input type="hidden" name="cm" value="4"><input type="hidden" name="set_sc" value="true" required><input type="hidden" name="set_aps" value="true" required><input type="hidden" name="set_eom" value="false" required></form>

Screenshots

No response

Cabu commented 11 months ago

Possible patch that works for me:

In class GoogleV2(Tse), add the following method:

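    def get_consent_post(self, consent_html: str) -> tuple:
        # Requires `import lxml.etree` at module level.
        # Returns (post_url, post_values) taken from the first form of the consent page.
        default_url = 'https://consent.google.com/save'
        et = lxml.etree.HTML(consent_html)
        form_element = et.xpath('.//form[1]')
        if not form_element:
            return default_url, {}  # no form found: fall back to the known endpoint
        post_url = form_element[0].attrib.get('action', default_url)
        input_elements = form_element[0].xpath('.//input[@type="hidden"]')
        post_values = {el.attrib.get('name'): el.attrib.get('value') for el in input_elements}
        return post_url, post_values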

In the method google_api, replace the following (around line 591):

            if 'consent.google.com' == urllib.parse.urlparse(r.url).hostname:
                self.host_headers.update({'cookie': self.get_consent_cookie(r.text)})
                host_html = self.session.get(self.host_url, headers=self.host_headers, timeout=timeout, proxies=proxies).text
            else:

with:

            if 'consent.google.com' == urllib.parse.urlparse(r.url).hostname:
                post_url, post_form = self.get_consent_post(r.text)
                host_html = self.session.post(post_url, data=post_form, headers=self.host_headers, timeout=timeout, proxies=proxies).text
            else:

self.host_headers.update({'cookie': self.get_consent_cookie(r.text)}) and the method get_consent_cookie() don't work anymore...
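
For anyone who wants to check the extraction step outside the library, here is a minimal offline sketch. It uses the standard-library html.parser instead of lxml, so it is not the patch itself. The names ConsentFormParser and parse_consent_form and the trimmed sample form are made up for illustration.

```python
from html.parser import HTMLParser

DEFAULT_ACTION = "https://consent.google.com/save"  # same fallback as in the patch


class ConsentFormParser(HTMLParser):
    """Collect the action URL and hidden inputs of the first <form> on a page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("type") == "hidden":
            self.fields[attrs.get("name")] = attrs.get("value")

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            # Stop after the first form so later forms are ignored.
            self._in_form = False
            self._done = True


def parse_consent_form(html):
    """Return (post_url, post_values) for the first form found in `html`."""
    parser = ConsentFormParser()
    parser.feed(html)
    return parser.action or DEFAULT_ACTION, parser.fields


# Trimmed-down version of the form from the log output (illustrative only).
sample = (
    '<form action="https://consent.google.com/save" method="POST">'
    '<input type="hidden" name="gl" value="IE">'
    '<input type="hidden" name="continue" value="https://translate.google.com/">'
    '<input type="hidden" name="set_sc" value="true" required>'
    '</form>'
)
post_url, post_values = parse_consent_form(sample)
```

The resulting post_url and post_values are what would be handed to session.post(...) in the replacement code above.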

Cabu commented 11 months ago

Confirmation! self.host_headers.update({'cookie': self.get_consent_cookie(r.text)}) and the method get_consent_cookie() are not necessary anymore and could be deleted!

UlionTse commented 11 months ago

@Cabu Copy that, sir!

UlionTse commented 11 months ago

@Cabu Bro, please take a look at #35 and #57.

Cabu commented 11 months ago

I don't know about #35, but my patch clearly solves #57 in the proper way (simulating the click on "Accept all") without the proposed make_temp_language_map workaround.
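
To make "simulating the click" concrete, here is a rough sketch of the flow, written against a stand-in session so it runs offline. get_past_consent, FakeSession and FakeResponse are hypothetical names, not part of translators. The sketch assumes the real session behaves like requests.Session and that the response to the POST is the page you actually wanted, which I have only observed and not verified against any documentation.

```python
from urllib.parse import urlparse

CONSENT_HOST = "consent.google.com"


def get_past_consent(session, url, parse_form, **request_kwargs):
    """GET `url`; if we land on the consent page, submit its form and return that HTML.

    `session` is anything with requests.Session-style get/post methods.
    `parse_form` maps the consent HTML to (post_url, post_values).
    """
    response = session.get(url, **request_kwargs)
    if urlparse(response.url).hostname != CONSENT_HOST:
        return response.text  # no consent wall, use the page as-is
    post_url, post_values = parse_form(response.text)
    # Posting the hidden fields is the scripted equivalent of pressing "Accept all".
    return session.post(post_url, data=post_values, **request_kwargs).text


class FakeResponse:
    def __init__(self, url, text):
        self.url, self.text = url, text


class FakeSession:
    """Offline stand-in for requests.Session, used only to exercise the flow."""

    def __init__(self):
        self.posted = None

    def get(self, url, **kwargs):
        # Pretend the server redirected us to the consent page.
        return FakeResponse("https://consent.google.com/m?continue=" + url, "<consent page>")

    def post(self, url, data=None, **kwargs):
        self.posted = (url, data)
        return FakeResponse("https://translate.google.com/", "<translate page>")


session = FakeSession()
html = get_past_consent(
    session,
    "https://translate.google.com",
    parse_form=lambda _html: ("https://consent.google.com/save", {"gl": "IE"}),
)
```

With a real session, parse_form would be the form-extraction method from the patch, and timeout, proxies and headers would be passed through request_kwargs.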

UlionTse commented 11 months ago

Please upgrade to v5.8.5.

Cabu commented 11 months ago

5.8.5 doesn't work for me. get_languages gets blocked by the cookie consent form :(

UlionTse commented 11 months ago

Quoting Cabu: "Confirmation! self.host_headers.update({'cookie': self.get_consent_cookie(r.text)}) and the method get_consent_cookie() are not necessary anymore and could be deleted!"

@Cabu ?

Cabu commented 11 months ago

Your new version deleted "get_consent_cookie", which no longer worked, and that is nice, but it doesn't implement the "get_consent_post" patch I proposed to work around the European cookie consent form Google introduced.

I updated my proposed patch accordingly.

UlionTse commented 11 months ago

@Cabu Bro, thanks for your contributions; a PR from you is welcome.

UlionTse commented 11 months ago

@Cabu Bro, please pip install --upgrade translators==5.8.6.