TapeWerm / MCscripts

Minecraft Java and Bedrock Dedicated Server systemd units and scripts for backups, automatic updates, and posting logs to chat bots
MIT License
154 stars, 28 forks

No Response from https://www.minecraft.net #43

Closed lewebster closed 3 years ago

lewebster commented 3 years ago

Hi, Minecraft told me that my server is out of date, so I started looking for the problem. It turned out that the mcbe-getzip service hasn't been downloading the current version for some time. I have the exact same problem as https://github.com/TapeWerm/MCscripts/issues/42, so I tried to figure it out myself by adding some echo commands to mcbe_getzip.sh. The script stops at this line:

webpage=$(wget --user-agent MCscripts --prefer-family=IPv4 -nv https://www.minecraft.net/en-us/download/server/bedrock/ -O -)

For debugging, I ran the following directly from bash:

wget -d --user-agent MCscripts --prefer-family=IPv4 https://www.minecraft.net/en-us/download/server/bedrock/ -O -

and got:

Setting --user-agent (useragent) to MCscripts
Setting --user-agent (useragent) to MCscripts
Setting --prefer-family (preferfamily) to IPv4
Setting --prefer-family (preferfamily) to IPv4
Setting --output-document (outputdocument) to -
Setting --output-document (outputdocument) to -
DEBUG output created by Wget 1.20.3 on linux-gnu.

Reading HSTS entries from /root/.wget-hsts
URI encoding = ‘UTF-8’
--2021-07-14 12:54:31--  https://www.minecraft.net/en-us/download/server/bedrock/
Resolving www.minecraft.net (www.minecraft.net)... 2.16.186.19, 2.16.186.27, 2.16.186.8
Caching www.minecraft.net => 2.16.186.19 2.16.186.27 2.16.186.8
Connecting to www.minecraft.net (www.minecraft.net)|2.16.186.19|:443... connected.
Created socket 3.
Releasing 0x00005575ca429f90 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 3 to SSL handle 0x00005575ca42a180
certificate:
  subject: CN=*.minecraft.net,O=Mojang AB,L=Stockholm,C=SE
  issuer:  CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US
X509 certificate successfully verified and matches host www.minecraft.net

---request begin---
GET /en-us/download/server/bedrock/ HTTP/1.1
User-Agent: MCscripts
Accept: */*
Accept-Encoding: identity
Host: www.minecraft.net
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...

And that's it. The request is sent, but no response ever arrives.

Any ideas?
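
One way to keep a stalled request like this from blocking the service is to put a hard deadline on the fetch. The following is a minimal sketch, assuming GNU coreutils `timeout` is installed; `run_with_deadline` is a hypothetical helper and not part of MCscripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of MCscripts: run a command with a hard deadline.
# Relies on GNU coreutils `timeout`, which exits with status 124 when the
# deadline expires before the command finishes.
run_with_deadline() {
    local seconds=$1
    shift
    timeout "$seconds" "$@"
}

# Example use (commented out because it needs network access):
# webpage=$(run_with_deadline 60 wget --user-agent MCscripts -nv -O - \
#     https://www.minecraft.net/en-us/download/server/bedrock/) \
#     || echo 'fetch failed or timed out' >&2
```

wget also has its own `--timeout` and `--tries` options, which may be enough on their own; the wrapper simply makes the deadline independent of the tool being called.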

TapeWerm commented 3 years ago

Thanks for the verbosity. My server updated fine, I wonder if Mojang's CDN is blocking some IP. If you are comfortable answering, what country is your home server or cloud host from? It's 5 AM where I live, but my sleep schedule is a mess so I am wide awake right now. I wonder if the en-us is a bad link to have MCscripts default to.

tassaron commented 3 years ago

This happens to me as well. My server is a VPS on DigitalOcean, in Toronto, Canada.

jared-hess commented 3 years ago

I think you are right about the CDN blocking certain IPs. The request times out when I run the wget request on my VPS, but it works when I run it from my home PC.

TapeWerm commented 3 years ago

Thanks for the info all. Definitely sounds like the CDN is up to no good again. I'll look into a workaround.

TapeWerm commented 3 years ago

CDN Planet says Amazon CloudFront is their CDN.

lewebster commented 3 years ago

Yep, I can confirm that it works flawlessly from my home machines. My server is hosted by Hetzner in Germany, so this is definitely widespread IP blocking.

I also tried with de-de, but no luck either.

TapeWerm commented 3 years ago

wget --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0" -nv https://www.minecraft.net/en-us/download/server/bedrock -O -

I'm playing devil's advocate by lying about the user agent, but does that work? Also, does the CDN vary by country?

lewebster commented 3 years ago

Changing the User-Agent gets the same result. I tried that before. Sorry, I should've mentioned it. I'm not sure about different CDNs per region.

TapeWerm commented 3 years ago

One solution I don't want to do but could do is poll the Minecraft Wiki for the link. The obvious problem is that is begging to get hacked. One malicious edit of the Minecraft Wiki and MCscripts puts minerd on all your Minecraft servers. I guess I could make sure the link is from https://minecraft.azureedge.net/ but that is crude. I might rewrite the web scraper in Python 3 or Rust at that point. Probably Python for ease of use. Is that a sane or stupid idea? Sounds kinda stupid to me, but definitely viable.
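
The domain check mentioned above could look roughly like this in Bash. This is a minimal sketch: `is_trusted_url` is a hypothetical name, and the allowed path characters and the `.zip` suffix are assumptions about what a valid link looks like, not something the thread confirms.

```shell
#!/usr/bin/env bash
# Hypothetical check, not MCscripts code: accept a download link only if it is
# an HTTPS URL on the expected host and ends in .zip.
is_trusted_url() {
    local url=$1
    # Anchored at both ends so that look-alike hosts such as
    # minecraft.azureedge.net.example.com are rejected.
    local pattern='^https://minecraft\.azureedge\.net/[A-Za-z0-9._/-]+\.zip$'
    [[ $url =~ $pattern ]]
}

# Example use with a made-up link:
# is_trusted_url 'https://minecraft.azureedge.net/bin-linux/example.zip' && echo trusted
```

A check like this only constrains where the file comes from. It does nothing about a link that points at an old version on the right host.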

TapeWerm commented 3 years ago

Is this still an issue? Should someone make an issue on https://bugs.mojang.com/ ? If needed I can make a Python program to get the minecraft.azureedge.net URL from the Minecraft Wiki. It's gross tech debt to incorporate pip venv into a predominantly Bash project but dirty deeds can be done for the low low price of venv.

lewebster commented 3 years ago

Yes, it's still an issue. If there's any way you can fix this, I'd be very grateful. Otherwise I'd need to manually download new server versions on another system and copy them to my server.

TapeWerm commented 3 years ago

I noticed https://github.com/TheRemote/MinecraftBedrockServer uses curl and curl is picky about the trailing slash in the URL. What happens if you remove the trailing slash from wget?

TapeWerm commented 3 years ago

wget --user-agent MCscripts --random-wait -nv https://www.minecraft.net/en-us/download/server/bedrock -O -
curl --user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)' https://www.minecraft.net/en-us/download/server/bedrock

For some reason curl doesn't work with MCscripts as a user agent.

TapeWerm commented 3 years ago

I should also note that if anyone cloned yesterday, I broke the update script for a few hours. GitHub says 5 people cloned. I added an "if regular file exists" check before trying to copy a JSON file, in case a future update adds more JSON files that would fail to copy because they don't exist in the old server. I forgot to use the general "file exists" check instead of "regular file exists", so it only copied whitelist.json and server.properties and deleted the worlds folder, which wasn't copied. I didn't join the world to test it; I just looked at the server in the menu. So if you updated MCscripts yesterday, please update it again. It will back up your world, but it's better not to have to use that backup.
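
The difference between the two checks is easy to reproduce in isolation. Below is a minimal sketch, assuming the checks in question are Bash's `-f` (regular file) and `-e` (any existing path) tests; the directory layout and the `copy_existing` helper are hypothetical and are not the actual update script.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction of the bug class, not the MCscripts update script.
old=$(mktemp -d)
new=$(mktemp -d)
touch "$old/server.properties" "$old/whitelist.json"
mkdir "$old/worlds"

# copy_existing TEST_OP: copy each entry from $old to $new if [ TEST_OP path ] holds.
copy_existing() {
    local test_op=$1 name
    for name in server.properties whitelist.json worlds; do
        if [ "$test_op" "$old/$name" ]; then
            cp -r "$old/$name" "$new/"
        fi
    done
}

# -f is true for regular files only, so the worlds directory is skipped.
copy_existing -f
skipped_with_f=$([ -e "$new/worlds" ] && echo no || echo yes)

# -e is true for any existing path, so the worlds directory is copied too.
copy_existing -e
copied_with_e=$([ -d "$new/worlds" ] && echo yes || echo no)

echo "worlds skipped with -f: $skipped_with_f"
echo "worlds copied with -e: $copied_with_e"
rm -rf "$old" "$new"
```

Both lines print `yes`: with `-f` only the two regular files are copied, and with `-e` the directory comes along as well.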

jared-hess commented 3 years ago

The results are different using curl, but it still doesn't work. Wget seems to just time out. But curl returns the following (both with and without the trailing slash).

$ curl --user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)' https://www.minecraft.net/en-us/download/server/bedrock/
<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;minecraft&#46;net&#47;en&#45;us&#47;download&#47;server&#47;bedrock&#47;" on this server.<P>
Reference&#32;&#35;18&#46;37e52517&#46;1626460326&#46;74ae266
</BODY>
</HTML>

lewebster commented 3 years ago

Confirmed: I get the same Access Denied response from curl as jared-hess, even without the slash at the end of the URL.

TapeWerm commented 3 years ago

Does https://github.com/TheRemote/MinecraftBedrockServer work? I see TheRemote uses more args and a different user agent than I do. I'm curious why the curl I used doesn't work.

lewebster commented 3 years ago

Yeah, the curl command from the other repo works fine. It seems like one of those additional parameters is the key. I might check later which ones are needed.

TapeWerm commented 3 years ago

I hate to ask again, but are the -L flag, -H headers, or both key to it working? -L could be as innocent as a redirect to your locale. It's also possible a locale header of en or en-US would be enough to prevent a redirect making -L unneeded. My other most likely solution is to use Python 3 and bs4 to do the scraping cause Python Requests is less likely to break. I prefer Rust to Python 3 as a language cause I like servo and scraper but realistically a compiled language defies the ease of use I want from MCscripts. While scraping the Wiki is probably safe if I check the download domain, a griefer could just as easily list an old version as the latest version or a good faith editor could accidentally typo a link.

lewebster commented 3 years ago

As far as I can tell, both are key. If I leave out either of those two -H flags, I get:

<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;minecraft&#46;net&#47;en&#45;us&#47;download&#47;server&#47;bedrock&#47;" on this server.<P>
Reference&#32;&#35;18&#46;17ba1002&#46;1626548292&#46;9d862e5
</BODY>
</HTML>

If I leave out -L, I get:

<html>
<head><title>302 Found</title></head>
<body>
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.21.0</center>
</body>
</html>

TapeWerm commented 3 years ago

Option 1 is for text/html, option 2 is for encoding identity. When I copy Firefox dev tools network cURL request, I don't see Accept-Encoding, I see Accept. According to MDN docs on content negotiation, text/html is a separate resource from encodings, and HTML is what I am after. Also, I tried scraping minecraft.net with Rust's reqwest library, it didn't work. Probably the user agent yet again, and if the library isn't going to handle user agent spoofing for me I may as well continue to do it myself in Bash. I will switch from wget to curl cause I am already posting with it in mcbe_log, and I am not using wget's recursive scraping like I've done in other projects.

1: curl -A 'Mozilla/5.0 (X11; Linux x86_64)' -H 'Accept: text/html' -H 'Accept-Language: en-US' -L https://www.minecraft.net/en-us/download/server/bedrock
2: curl -A 'Mozilla/5.0 (X11; Linux x86_64)' -H 'Accept-Encoding: identity' -H 'Accept-Language: en-US' -L https://www.minecraft.net/en-us/download/server/bedrock
3: curl -A 'Mozilla/5.0 (X11; Linux x86_64)' -H 'Accept-Language: en-US' --compressed -L https://www.minecraft.net/en-us/download/server/bedrock

Edit: I read more of that MDN article and the Accept and Accept-Encoding headers can be mixed. However Firefox does give a --compressed flag. Does option 3 work for encoding?
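
For reference, here is a minimal sketch of how one of these options might sit inside a fetch-and-extract step. The curl flags and headers are taken from option 1 above; `fetch_page` and `extract_zip_url` are hypothetical names, `--max-time 60` is an arbitrary choice, and the URL pattern is an assumption about how the download link appears in the page. It is not the code from any MCscripts commit.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_page and extract_zip_url are hypothetical names, and the
# link pattern below is an assumption about the page's markup.
fetch_page() {
    # -f: fail on HTTP errors, -sS: quiet but still print errors,
    # -L: follow redirects, --max-time: give up after 60 seconds.
    curl -fsSL --max-time 60 \
        -A 'Mozilla/5.0 (X11; Linux x86_64)' \
        -H 'Accept: text/html' \
        -H 'Accept-Language: en-US' \
        https://www.minecraft.net/en-us/download/server/bedrock
}

extract_zip_url() {
    # Read HTML on stdin and print the first matching link, if any.
    grep -Eo 'https://minecraft\.azureedge\.net/bin-linux/[^"]+\.zip' | head -n 1
}

# Example use (commented out because it needs network access):
# url=$(fetch_page | extract_zip_url)
```

Keeping the fetch and the extraction separate means the extraction can be tested against saved HTML without touching the CDN at all.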

TapeWerm commented 3 years ago

Commit https://github.com/TapeWerm/MCscripts/commit/837028224e2ad0639e5b481bdd9329e053b2533b might fix this. At the least I tested both getzip and getjar to success on my end, but all that proves is I didn't break it for American users who are already fine. I hope this clears up the issues international users are having. It's a pain to manually pull the ZIP file, and I hope I have beaten the CDN at their little game. Thank you all for helping me debug this and if this commit didn't fix it for you I can poke it more. I do have homework and work work that needs to be done but debugging annoying issues like this is probably more educational to me.

lewebster commented 3 years ago

Well, that commit worked like a charm for me. Issue closed. Thank you so much.