Qiangong2 opened this issue 7 months ago
> is there compatibility with the old script?
The new script is not compatible with the old one, as it fundamentally changes how, and what, data is archived. The old script only selected a few chunks of data to save, in an effort to make the metadata file "clean looking", but as a result it threw away a lot of data that would be rather important/nice to have when making a backup. The new script also supports object versioning, for when DataStore objects update
The old script also only backed up course data, and lacked a LOT of maker data
The tradeoff of the extra data is that it also takes up more disk space. Right now we have a test instance running on a dedicated server which has downloaded 142,677 objects (objects can be courses, event courses, makers, etc., not just courses), and the total disk space used, metadata included, is currently 4.7GB
This new script should, in theory, also be much faster than the old one. The old script went one by one through the entire uint64 space looking for courses, whereas this script starts at the first known good ID and checks a configurable number of chunks of 100 IDs at a time. The server only allows up to 100 IDs to be checked per request, and by default this script checks 20 chunks of IDs at a time, so 2,000 IDs at once. It then downloads each object's data in parallel. However, this also depends on your internet speed and machine specs (some machines may struggle to write thousands of files at once)
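To make the scanning strategy above concrete, here is a rough sketch of chunked, concurrent ID checking. Every name here (`fetch_chunk`, `scan_pass`, `chunk_starts`) is hypothetical and not the script's actual API — the real script speaks the NEX DataStore protocol, which is stubbed out below:

```python
import asyncio

CHUNK_SIZE = 100   # the server allows at most 100 IDs per request
NUM_CHUNKS = 20    # configurable; 20 chunks means 2,000 IDs per pass

def chunk_starts(next_id, num_chunks=NUM_CHUNKS, chunk_size=CHUNK_SIZE):
    """First object ID of each chunk checked in one pass."""
    return [next_id + i * chunk_size for i in range(num_chunks)]

async def fetch_chunk(start_id):
    # Hypothetical stand-in for a DataStore metadata request covering
    # IDs [start_id, start_id + CHUNK_SIZE); returns the IDs that exist.
    return []

async def scan_pass(next_id):
    # Check all chunks of one pass concurrently, then advance the cursor.
    results = await asyncio.gather(
        *(fetch_chunk(s) for s in chunk_starts(next_id))
    )
    found = [obj_id for chunk in results for obj_id in chunk]
    # The real script would now download each found object's data in
    # parallel before moving on (omitted here).
    return found, next_id + CHUNK_SIZE * NUM_CHUNKS

found, cursor = asyncio.run(scan_pass(1000))
print(len(chunk_starts(1000)), cursor)  # prints: 20 3000
```

The concurrency is what buys the speedup over the old one-ID-at-a-time loop, at the cost of hammering the server (and your disk) much harder.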
Also, when I try and run the new script in this repo, I receive the following error
You need to update to Python 3.9 or later
Do you still want me to continue archiving courses? Or, since you already have an instance running, do you have it covered now? I have the storage space to handle the extra data, I just want to know if it's needed.
Sure, we're still trying to figure out the logistics of ours too, so having someone else make a backup just in case is always nice. I would suggest holding off for now though, as we are discussing some other ways to improve the archiving which may change things
I'll ping you here when we've settled
@Qiangong2 I believe we have settled on an archiving method now. So feel free to begin your own backup now
After some rough testing, this new script was able to download around 300,000 objects from DataStore in a 12 hour period
There are currently, as of November 29th 2023, 15,651,599 objects in the Super Mario Maker DataStore server, so it should take roughly 25-30 days running non-stop to back all of them up
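A quick back-of-the-envelope check on that estimate:

```python
total_objects = 15_651_599      # DataStore object count as of Nov 29th 2023
objects_per_day = 300_000 * 2   # ~300,000 objects per 12-hour period

days = total_objects / objects_per_day
print(round(days, 1))  # prints: 26.1
```

So roughly 26 days of non-stop downloading at that rate, which is where the 25-30 day figure comes from once you allow for crashes and slowdowns.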
Do note though that "object" is not the same thing as "course". DataStore is a generic protocol used by many Nintendo games, as a way to interact with AWS S3 objects. It's basically just a fancy file upload/download protocol. An "object" is just any file inside DataStore (for instance, Animal Crossing: New Leaf uses DataStore objects for dream towns)
Super Mario Maker uses objects for many different things besides courses, including:

- Maker objects: Super Mario Maker for the Wii U sold around 4 million copies worldwide, so around 4-5 million of the 15,651,599 objects will be "maker" objects, depending on things like the number of copies shared with other players, the number of times it was pirated, etc. These maker objects have a file size of 0 (I don't know why Nintendo did it this way), so don't be alarmed by that
- Event courses: There is only 1 Event Course metadata file, which will be somewhat large (around 500KB), and then around 10 or so event courses (these have object IDs in the 9X0000 range)
- Courses: All other objects should just be regular user-made course objects
The script gzip-compresses all metadata files using compression level 6, to try and save space. Even so, our instance of the script has downloaded 625,278 total objects with a total of 28GB of storage used, due to the additional metadata. So storage needs have definitely increased since the old script
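For illustration, this is roughly what writing and reading back a metadata file at gzip level 6 looks like in Python. The file name and metadata fields are made up for this example; the script's actual layout may differ:

```python
import gzip
import json
import os
import tempfile

# Hypothetical metadata fields, not the script's real schema.
metadata = {"dataId": 123456, "size": 0}

path = os.path.join(tempfile.gettempdir(), "123456-metadata.json.gz")

# compresslevel=6 matches the level mentioned above. Python's gzip
# module defaults to 9; level 6 trades slightly larger files for
# noticeably faster writes, which matters at this volume.
with gzip.open(path, "wt", encoding="utf-8", compresslevel=6) as f:
    json.dump(metadata, f)

with gzip.open(path, "rt", encoding="utf-8") as f:
    restored = json.load(f)
```

Because most of these JSON files are tiny, the per-file gzip header overhead means the on-disk savings are real but modest.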
Also, it is HIGHLY recommended that you use some kind of process manager like PM2 to run the script. The old script checked each possible DataStore object ID one by one, whereas this new script processes up to 100 objects at a time. This fairly often causes the script to die while trying to mass-download files from S3. Nintendo's Super Mario Maker server also seems a tad more unstable these days, so sometimes the script disconnects completely. A process manager like PM2 will automatically restart the process when it crashes
Alright, I got it running. Seems to be working so far.
Is the "objects" folder the courses? I can't read apparently
All the other folders are the exact same size. Is that due to it indexing every single ID?
EDIT: Yeah... It's moving a lot faster than the old script lol
Did you pull the latest commit? You should have a last-checked-timestamp.txt file, not a last-checked-offset.txt file. The latter is from an older version of this script which had a bug that would cause it to slow down exponentially as time went on
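For context, offset-based pagination typically forces the server to skip over everything already seen on every request, so queries get slower and slower as the offset grows; resuming from a timestamp keeps each query cheap. A purely illustrative sketch of the timestamp-window approach (not the script's actual code), using a 12-hour window:

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=12)  # illustrative window size

def next_window(last_checked):
    """Given the last checked upload timestamp, return the next
    [start, end) window to query for objects uploaded in that range."""
    return last_checked, last_checked + WINDOW

start, end = next_window(datetime(2016, 1, 27, 11, 30, 34))
# Query the objects uploaded between `start` and `end`, then persist
# `end` (e.g. to last-checked-timestamp.txt) so that a restart resumes
# from where the previous run left off instead of re-scanning.
```

Each query only ever covers a fixed-size time window, so the cost per request stays constant no matter how far into the scan you are.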
> All the other folders are the exact same size. Is that due to it indexing every single ID?
The other files are fairly small; they're gzip-compressed JSON data, and the majority of them are less than a kilobyte in size. So yes, they should generally appear to be around the same size when viewed like that (though if you check the actual byte size the folders take up, it will differ)
Is there a way to check how many courses you have downloaded vs objects?
Was my account banned?
Whenever I try to run the script, I get nothing but `RuntimeError: PRUDP connection failed`
Very unlikely. I've been running the script for several weeks now non-stop and it's fine
We have experienced this too at times, and it usually resolves itself
Our best guess is that Nintendo has some form of temporary ratelimit in place
The longest I've had this happen is a couple of minutes. If you were banned, you would almost certainly see an actual ban error from the server (it supports these, I've just never seen one personally)
Hmm. Alright. I still get the PRUDP error, and looking at the pm2 log, I've been getting it for the past 8 hours. I'll hold off until tomorrow and see if it lets me back in. I'm at 165GB of course data at the moment.
It's not a huge deal if I was banned, it'd just kinda suck since I've had the account for nearly a decade :/
@jonbarrow Still getting the PRUDP Connection failure error. I was able to play a Mario Kart match using the same account, so my account isn't banned. Is it possible I was IP blocked?
Super Mario Maker and Mario Kart 8 should be using the same authentication server. If you were IP banned in Super Mario Maker, you would also be banned in Mario Kart 8 (assuming the ban is on the authentication server). Nintendo only uses 2 different authentication servers for all games
Try actually going online in Super Mario Maker and see what happens. I still find it doubtful that they blocked you, since our script has been running for several weeks non-stop
My server and Wii U are in different locations, which is why I assumed it might be an IP block.
Here's the error:
```
Packet timed out: <PRUDPPacket type=TYPE_SYN flags=NEED_ACK seq=0 frag=0>
Traceback (most recent call last):
  File "/smm1-archive/archival-tools/super-mario-maker/archive.py", line 439, in <module>
    anyio.run(main)
  File "/home/kurt/.local/lib/python3.9/site-packages/anyio/_core/_eventloop.py", line 70, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/kurt/.local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 292, in run
    return native_run(wrapper(), debug=debug)
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/kurt/.local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
    return await func(*args)
  File "/smm1-archive/archival-tools/super-mario-maker/archive.py", line 391, in main
    async with be.login(NEX_USERNAME, NEX_PASSWORD) as client:
  File "/usr/lib/python3.9/contextlib.py", line 175, in __aenter__
    return await self.gen.__anext__()
  File "/home/kurt/.local/lib/python3.9/site-packages/nintendo/nex/backend.py", line 81, in login
    async with rmc.connect(self.settings, host, port, stream_id, context, creds, servers) as client:
  File "/usr/lib/python3.9/contextlib.py", line 175, in __aenter__
    return await self.gen.__anext__()
  File "/home/kurt/.local/lib/python3.9/site-packages/nintendo/nex/rmc.py", line 286, in connect
    async with prudp.connect(settings, host, port, vport, 10, context, credentials) as client:
  File "/usr/lib/python3.9/contextlib.py", line 175, in __aenter__
    return await self.gen.__anext__()
  File "/home/kurt/.local/lib/python3.9/site-packages/nintendo/nex/prudp.py", line 1551, in connect
    async with transport.connect(vport, type, credentials, disconnect_timeout=disconnect_timeout) as client:
  File "/usr/lib/python3.9/contextlib.py", line 175, in __aenter__
    return await self.gen.__anext__()
  File "/home/kurt/.local/lib/python3.9/site-packages/nintendo/nex/prudp.py", line 1388, in connect
    await client.handshake(credentials, group)
  File "/home/kurt/.local/lib/python3.9/site-packages/nintendo/nex/prudp.py", line 813, in handshake
    raise RuntimeError("PRUDP connection failed")
RuntimeError: PRUDP connection failed
```
It times out on the first packet, but I'm unsure why. If it was denying the credentials, it would say that, right?
> My server and Wii U are in different locations
Where is the server located? And does the old script still work on that server? It's possible this server really does just have a terrible connection to Nintendo's
> It times out on the first packet, but I'm unsure why. If it was denying the credentials, it would say that, right?
Yes, it should be sending a real error. Not just timing out. This isn't even making it to the authentication server
> Where is the server located? And does the old script still work on that server? It's possible this server really does just have a terrible connection to Nintendo's
The old script does work. Also, the server is in Oracle Cloud with a 2 gigabit connection out. I haven't had issues before this, which is odd.
> Yes, it should be sending a real error. Not just timing out. This isn't even making it to the authentication server
The last-checked-timestamp is 135299250082, if that's significant.
Actually, the script does work if I manually change the last-checked-timestamp. It just gives this continuously:
```
Downloading next 100 objects between 27-1-2016 11:30:34 to 27-1-2016 23:30:34
Found 1 objects
Skipping 28105732
More objects may be available, trying new offset!
Downloading next 100 objects between 27-1-2016 11:30:34 to 27-1-2016 23:30:34
Found 1 objects
Skipping 28105732
More objects may be available, trying new offset!
Downloading next 100 objects between 27-1-2016 11:30:34 to 27-1-2016 23:30:34
Found 1 objects
Skipping 28105732
More objects may be available, trying new offset!
Downloading next 100 objects between 27-1-2016 11:30:34 to 27-1-2016 23:30:34
Found 1 objects
Skipping 28105732
More objects may be available, trying new offset!
Downloading next 100 objects between 27-1-2016 11:30:34 to 27-1-2016 23:30:34
Found 1 objects
Skipping 28105732
More objects may be available, trying new offset!
Downloading next 100 objects between 27-1-2016 11:30:34 to 27-1-2016 23:30:34
Found 1 objects
Skipping 28105732
```
until I ctrl-c.
We have released a statement about the connection issues. It's not a you thing, Nintendo seems to have fucked up https://twitter.com/PretendoNetwork/status/1736325668412031255
Ah, that makes sense. At least I know I'm not crazy :D
Is there a way in the script to force it to always connect to the same IP?
The script seems to be stuck downloading the same course over and over ever since I changed the last-checked-timestamp manually. Is there a way to get the script back on track (besides deleting everything and starting from scratch)?
hi @Qiangong2 and @jonbarrow, I wanted to see if you guys were still running the backups as the servers are getting to EOL.
> hi @Qiangong2 and @jonbarrow, I wanted to see if you guys were still running the backups as the servers are getting to EOL.
We announced via Twitter several months ago that our scan had finished and we have a full backup
This is a continuation of https://github.com/jonbarrow/smm1-course-archive/issues/2.
On the previous version of the script, I was able to archive over 3 million courses (about 190GB altogether). With the new script, will everything have to be archived again? Or is there compatibility with the old script?