xroche / httrack

HTTrack Website Copier, copy websites to your computer (Official repository)
http://www.httrack.com/

Can not bear crazy server (Moved Permanently) for https://max.skyrock.com/ #258

Open KnuX opened 1 year ago

KnuX commented 1 year ago

Hi,

I compiled a fresh build of WinHTTrack. The app runs fine, but at the copy step the log fills up with warnings:

HTTrack3.49-2 launched on Thu, 17 Aug 2023 17:26:39 at https://max.skyrock.com/ +*.png +*.gif +*.jpg +*.jpeg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar
(winhttrack -qwC2%Pb0%s%u%I0c1024R3H0%kf2A0%c512#L0%f%C#f -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F  -%l "fr, en, *" https://max.skyrock.com/ -O1 "F:\Mes Sites Web\test" +*.png +*.gif +*.jpg +*.jpeg +*.css +*.js -ad.doubleclick.net/* -%! -mime:application/foobar )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive information,
 such as username/password authentication for websites mirrored in this project
 do not share these files/folders if you want these information to remain private
17:26:39 Warning:  * security warning: !!! BYPASSING SECURITY LIMITS - MONITOR THIS SESSION WITH EXTREME CARE !!!
17:26:39 Warning:  Moved Permanently for https://max.skyrock.com/robots.txt
17:26:39 Warning:  Redirected link is identical because of 'URL Hack' option: https://max.skyrock.com/robots.txt and https://max.skyrock.com/robots.txt
17:26:39 Warning:  Can not bear crazy server (Moved Permanently) for https://max.skyrock.com/robots.txt
17:26:39 Warning:  Moved Permanently for https://max.skyrock.com/
17:26:39 Warning:  Redirected link is identical because of 'URL Hack' option: https://max.skyrock.com/ and https://max.skyrock.com/
17:26:39 Warning:  Can not bear crazy server (Moved Permanently) for https://max.skyrock.com/
17:26:39 Warning:  No data seems to have been transferred during this session! : restoring previous one!
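
For readers following the log: each URL produces the same three-step chain, a 301 response, a redirect target that compares equal to the requested URL, and then the link being given up on. The sketch below illustrates that kind of "redirect points back to itself" comparison. It is not HTTrack's code. The function names `normalize` and `is_redirect_loop` are hypothetical, and the normalization rules (dropping a leading `www.` and collapsing repeated slashes) are an assumption based on how the `%u` URL hacks option is described in HTTrack's help text.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Hypothetical normalization, loosely modelled on the 'URL Hack' idea."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: www.foo.com and foo.com are treated as the same host.
    if host.startswith("www."):
        host = host[4:]
    # Assumption: runs of slashes in the path collapse to a single slash.
    path = re.sub(r"/{2,}", "/", parts.path) or "/"
    # The fragment is dropped; the query string is kept as-is.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def is_redirect_loop(requested, location):
    """True when the redirect target is the requested URL after normalization."""
    return normalize(requested) == normalize(location)
```

With a check like this, a 301 whose target normalizes to the URL that was just requested can never make progress, which would match the "Redirected link is identical" warning followed by "Can not bear crazy server".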

I do not have this issue with the latest official release using the same parameters.

curl gives this result:

.\curl.exe -i -A "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" https://max.skyrock.com/robots.txt
HTTP/2 200
server: Apache
vary: Accept-Encoding
cache-control: public, max-age=2678400
content-type: text/plain; charset=ISO-8859-15
content-security-policy: upgrade-insecure-requests
strict-transport-security: max-age=15552000; includeSubDomains
date: Thu, 17 Aug 2023 15:37:57 GMT
set-cookie: locale=fr_FX; path=/; domain=skyrock.com; secure; httponly
set-cookie: plocale=fr_FX; expires=Tue, 13-Feb-2024 15:37:57 GMT; Max-Age=15552000; path=/; domain=skyrock.com; secure; httponly
set-cookie: tz=Europe%2FParis; path=/; domain=skyrock.com; secure; httponly
content-length: 252

User-agent: *
Disallow: /common/captcha.php
Disallow: /*?connect=1*
Disallow: /*?*&connect=1*
Disallow: /*?action=ADD_COMMENT

User-agent: bingbot
Crawl-delay: 30

Sitemap: https://max.skyrock.com/atom.xml
Sitemap: https://max.skyrock.com/sitemap.xml
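
The same check can be scripted so that redirects are reported instead of followed, which makes it easy to compare against the 301 that HTTrack logs. This is a minimal sketch using only the Python standard library. The names `NoRedirect`, `build_request` and `probe` are hypothetical helpers, and the user agent is the one from the command line above.

```python
import urllib.error
import urllib.request

USER_AGENT = "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow 3xx responses so they surface as HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def build_request(url, user_agent=USER_AGENT):
    """Build a GET request that sends the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url, user_agent=USER_AGENT, timeout=10):
    """Return (status_code, Location header or None) without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # 3xx responses (and 4xx/5xx) arrive here because nothing handled them.
        return err.code, err.headers.get("Location")


# Example call (needs network access):
# print(probe("https://max.skyrock.com/robots.txt"))
```

If this reports 200 with no `Location` header, as curl does, the redirect seen in the log is more likely tied to how the custom build issues the request than to the server's answer for this user agent.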

Regards, KnuX

KnuX commented 1 year ago

OK, I had to manually uncheck these two options, which had been checked automatically: [image]