libwww-perl / WWW-Mechanize

Handy web browsing in a Perl object
https://metacpan.org/pod/WWW::Mechanize

Failed to run on IDN domain name #325

Closed aponomarenko closed 2 years ago

aponomarenko commented 2 years ago

Hello!

Everything works fine with standard domain names, but with an internationalized domain name (IDN) I get:

Error GETing https://xn--agb0uh1idcdb.xn-----eabag7ad7b4a86af9a0ao5dfm6boib51bbbbcbbcbbbegqkbdbbb.xn--ai3ub/index.php?action=login2;sa=check;member=11020: Can't connect to xn--agb0uh1idcdb.xn-----eabag7ad7b4a86af9a0ao5dfm6boib51bbbbcbbcbbbegqkbdbbb.xn--ai3ub:443 (Name or service not known) at robot.pl line 27.

Code:

#!/usr/bin/perl

use Encode qw(decode);
use WWW::Mechanize;
use IO::Socket::SSL;
use HTTP::Cookies;

my $FORUM = "https://форум.школа-женского-счастья.рф";

$Mech = WWW::Mechanize->new();
$Mech->cookie_jar(HTTP::Cookies->new());
$Mech->get(decode("UTF-8", $FORUM)."/index.php");
$Mech->form_id("guest_form");

$Mech->field("user", "..."));
$Mech->field("passwrd", "...");
$Mech->click;

$Mech->get works, but $Mech->click doesn't. It tries to fetch xn--agb0uh1idcdb.xn-----eabag7ad7b4a86af9a0ao5dfm6boib51bbbbcbbcbbbegqkbdbbb.xn--ai3ub, but the Punycode form of my domain name is xn--l1adgmc.xn-----6kccnlg3aef0aghb5bfbs6ge7h0c.xn--p1ai. It looks like the Punycode conversion doesn't work. How can I work around this?

I'm using the latest version of WWW-Mechanize, 2.05, on Fedora 34.

Looking forward to your help.

simbabque commented 2 years ago

Thank you for your report. I've tried your code on WWW::Mechanize 2.06 after fixing the syntax error in line 13, and it works. Here is my output. This uses LWP::ConsoleLogger for debugging.

I can't read the output, but it looks like it's the same site saying the credentials are wrong. The URLs are definitely the same.

$  perl -MLWP::ConsoleLogger::Everywhere -I lib test.pl
GET https://xn--l1adgmc.xn-----6kccnlg3aef0aghb5bfbs6ge7h0c.xn--p1ai/index.php

.---------------------------------+--------------------.
| Request (before sending) Header | Value              |
+---------------------------------+--------------------+
| Accept-Encoding                 | gzip               |
| User-Agent                      | WWW-Mechanize/2.06 |
'---------------------------------+--------------------'

.--------------------------------+--------------------.
| Request (after sending) Header | Value              |
+--------------------------------+--------------------+
| Accept-Encoding                | gzip               |
| User-Agent                     | WWW-Mechanize/2.06 |
'--------------------------------+--------------------'

==> 200 OK

.-------------------------+--------------------------------------------------------------.
| Response Header         | Value                                                        |
+-------------------------+--------------------------------------------------------------+
| Cache-Control           | private                                                      |
| Client-Date             | Thu, 21 Oct 2021 13:54:05 GMT                                |
| Client-Peer             | 178.250.156.91:443                                           |
| Client-Response-Num     | 1                                                            |
| Client-SSL-Cert-Issuer  | /C=US/O=Let's Encrypt/CN=R3                                  |
| Client-SSL-Cert-Subject | /CN=xn--l1adgmc.xn-----6kccnlg3aef0aghb5bfbs6ge7h0c.xn--p1ai |
| Client-SSL-Cipher       | ECDHE-RSA-AES128-GCM-SHA256                                  |
| Client-SSL-Socket-Class | IO::Socket::SSL                                              |
| Connection              | close                                                        |
| Content-Encoding        | gzip                                                         |
| Content-Length          | 6245                                                         |
| Content-Type            | text/html; charset=UTF-8                                     |
| Date                    | Thu, 21 Oct 2021 13:54:05 GMT                                |
| Expires                 | Mon, 26 Jul 1997 05:00:00 GMT                                |
| Last-Modified           | Thu, 21 Oct 2021 13:54:05 GMT                                |
| Pragma                  | no-cache                                                     |
| Server                  | Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.4.24         |
| Set-Cookie              | PHPSESSID=eo3oubmmjed5ph9n7ho9k1rom2; path=/                 |
| Vary                    | Accept-Encoding                                              |
| X-Content-Type-Options  | nosniff                                                      |
| X-Frame-Options         | SAMEORIGIN                                                   |
| X-Powered-By            | PHP/7.4.24                                                   |
| X-XSS-Protection        | 1                                                            |
'-------------------------+--------------------------------------------------------------'

...
POST https://xn--l1adgmc.xn-----6kccnlg3aef0aghb5bfbs6ge7h0c.xn--p1ai/index.php

GET Params:

.-----------+----------------------------.
| Key       | Value                      |
+-----------+----------------------------+
| PHPSESSID | eo3oubmmjed5ph9n7ho9k1rom2 |
| action    | login2                     |
'-----------+----------------------------'

POST Params:

.-------------------+----------------------------------.
| Key               | Value                            |
+-------------------+----------------------------------+
| cookielength      | -1                               |
| f701fdc0fe0a      | 5a434c0c0edf90bb4a0244dfadeabacf |
| hash_passwrd      |                                  |
| openid_identifier |                                  |
| passwrd           | ...                              |
| user              | ...                              |
'-------------------+----------------------------------'

.---------------------------------+----------------------------------------------------------------------------.
| Request (before sending) Header | Value                                                                      |
+---------------------------------+----------------------------------------------------------------------------+
| Accept-Encoding                 | gzip                                                                       |
| Content-Length                  | 115                                                                        |
| Content-Type                    | application/x-www-form-urlencoded                                          |
| Cookie                          | PHPSESSID=eo3oubmmjed5ph9n7ho9k1rom2                                       |
| Cookie2                         | $Version="1"                                                               |
| Referer                         | https://xn--l1adgmc.xn-----6kccnlg3aef0aghb5bfbs6ge7h0c.xn--p1ai/index.php |
| User-Agent                      | WWW-Mechanize/2.06                                                         |
'---------------------------------+----------------------------------------------------------------------------'

.---------------------------------------------------------------------------------------------------------------------.
| Content                                                                                                             |
+---------------------------------------------------------------------------------------------------------------------+
| user=...&passwrd=...&cookielength=-1&openid_identifier=&hash_passwrd=&f701fdc0fe0a=5a434c0c0edf90bb4a0244dfadeabacf |
'---------------------------------------------------------------------------------------------------------------------'

...
==> 200 OK

Title: Школа Женского Счастья - Главная страница

.-------------------------+--------------------------------------------------------------.
| Response Header         | Value                                                        |
+-------------------------+--------------------------------------------------------------+
| Cache-Control           | private                                                      |
| Client-Date             | Thu, 21 Oct 2021 13:54:05 GMT                                |
| Client-Peer             | 178.250.156.91:443                                           |
| Client-Response-Num     | 1                                                            |
| Client-SSL-Cert-Issuer  | /C=US/O=Let's Encrypt/CN=R3                                  |
| Client-SSL-Cert-Subject | /CN=xn--l1adgmc.xn-----6kccnlg3aef0aghb5bfbs6ge7h0c.xn--p1ai |
| Client-SSL-Cipher       | ECDHE-RSA-AES128-GCM-SHA256                                  |
| Client-SSL-Socket-Class | IO::Socket::SSL                                              |
| Connection              | close                                                        |
| Content-Encoding        | gzip                                                         |
| Content-Length          | 3923                                                         |
| Content-Type            | text/html; charset=UTF-8                                     |
| Date                    | Thu, 21 Oct 2021 13:54:05 GMT                                |
| Expires                 | Mon, 26 Jul 1997 05:00:00 GMT                                |
| Last-Modified           | Thu, 21 Oct 2021 13:54:05 GMT                                |
| Pragma                  | no-cache                                                     |
| Server                  | Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.4.24         |
| Vary                    | Accept-Encoding                                              |
| X-Content-Type-Options  | nosniff                                                      |
| X-Frame-Options         | SAMEORIGIN                                                   |
| X-Powered-By            | PHP/7.4.24                                                   |
| X-XSS-Protection        | 1                                                            |
'-------------------------+--------------------------------------------------------------'
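
To check which hostname a given URL string turns into, the conversion can be tried in isolation. The following is a minimal sketch, not taken from the thread: it assumes only the URI module (which WWW::Mechanize already depends on) and compares the host derived from a decoded character string with the host derived from the same URL left as raw UTF-8 bytes. The variable names are illustrative. Whether undecoded bytes are what produced the hostname in the original error message is a guess and is not confirmed anywhere in this issue.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Encode qw(decode);
use URI;

# Without "use utf8", this literal is a string of UTF-8 bytes,
# as in the script from the report (the file must be saved as UTF-8).
my $bytes = "https://форум.школа-женского-счастья.рф/index.php";

# Decoding turns the bytes into a string of Perl characters.
my $chars = decode("UTF-8", $bytes);

# URI converts a non-ASCII host to an ASCII form when the object is built;
# host() returns that ASCII form.
my $char_host = URI->new($chars)->host;
my $byte_host = URI->new($bytes)->host;

print "from characters: $char_host\n";
print "from raw bytes:  $byte_host\n";
print "The two hosts differ.\n" if $char_host ne $byte_host;
```

The host printed for the decoded string should match the xn--l1adgmc... name in the log above. If a script ends up requesting a different hostname, one thing worth checking is whether every URL it passes on has been decoded to characters first.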

lathropd commented 2 years ago

This should probably be closed…

oalders commented 2 years ago

Thanks, everyone!