guidopetri / chess-pipeline

Pulling games from the Lichess API into a PostgreSQL database for data analysis.
GNU General Public License v3.0

Migrate from RPi to VPS #20

Closed guidopetri closed 4 years ago

guidopetri commented 4 years ago

Running this on a Raspberry Pi is super cool, but it's also super slow. Running Stockfish, especially, has made me realize that the RPi isn't really suited to "production-level" code; it's more of a testing ground. Additionally, the RPi comes with a few problems of its own: I need to keep it plugged in and connected somewhere for it to work (a VPS is always on), it needs a DynDNS hostname or a permanent IP for anything to query the DB, and opening ports on a home network (so that the RPi can be queried from e.g. a VPS) isn't the safest thing in the world.

As such, I need to work on migrating the current setup to a VPS and making sure that it works there. Once it's up and running successfully, I can move the RPi from production to testing/development.

guidopetri commented 4 years ago

This is currently in progress. The VPS I've set up can now run the pipeline, but since the VPS is computationally more powerful, it's re-running the analysis for all games (at depth 15 instead of 10).

Once the data all the way back to 2016 is imported and the cron jobs are set up, I'll leave the pipeline running on both the VPS and the RPi to make sure nothing breaks. If a week passes without problems, I'll migrate the newsletter over; then, after another week, I'll switch the RPi to be the development server.

Additionally, Redash (on the VPS) is already able to connect to the data, so it's definitely working properly. The only disadvantage of this setup is that to access the data externally (e.g. from my personal laptop/PC), I'd have to open the Postgres port. I'm still thinking about how to deal with this security issue.
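
One common way to reach Postgres from a laptop without opening its port is an SSH tunnel: Postgres keeps listening on the VPS's loopback interface only, and the laptop forwards a local port through SSH. The sketch below only builds the `ssh` command; it is not necessarily what this project ended up doing. The host name `vps.example.com`, the user `deploy`, and the local port 5433 are placeholders, and 5432 is assumed as the Postgres port because it is the default.

```python
import shlex


def ssh_tunnel_command(vps_host, ssh_user, local_port=5433, remote_port=5432):
    """Build an `ssh` command that forwards a local port to Postgres on the VPS.

    All arguments are illustrative placeholders; 5432 is the default Postgres port.
    """
    for port in (local_port, remote_port):
        if not 0 < port < 65536:
            raise ValueError(f"invalid port: {port}")
    return [
        "ssh",
        "-N",  # no remote command, only port forwarding
        "-L",  # local forward: local_port -> localhost:remote_port as seen from the VPS
        f"{local_port}:localhost:{remote_port}",
        f"{ssh_user}@{vps_host}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(ssh_tunnel_command("vps.example.com", "deploy")))
```

With the tunnel open, a client on the laptop would connect to `localhost:5433` as if the database were local, and only the SSH port needs to be reachable from outside.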

guidopetri commented 4 years ago

This is nearly done. All the data has been loaded onto the VPS (with new, improved, depth 15 evals - which is why I couldn't just copy stuff over), and it seems that the cron jobs work (with the caveat that some Docker services need to be stopped while the pipeline is running - too little available RAM, I guess).
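
The "stop some services while the pipeline runs" caveat can be sketched as a small wrapper that a cron job could call. This is a minimal illustration and not the project's actual cron setup: the container names and the `run_pipeline.py` script in the usage comment are hypothetical, and it assumes the services are plain containers that `docker stop` and `docker start` can manage.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def containers_stopped(names, run=subprocess.run):
    """Stop the named containers for the duration of the block, then start them again.

    `run` is injectable so the wrapper can be exercised without Docker installed.
    """
    names = list(names)
    if names:
        run(["docker", "stop", *names], check=True)
    try:
        yield
    finally:
        # Restart even if the pipeline raised, so the services are not left down.
        if names:
            run(["docker", "start", *names], check=True)


# Hypothetical usage from a cron-invoked script:
#
#     with containers_stopped(["redash_server", "redash_worker"]):
#         subprocess.run(["python", "run_pipeline.py"], check=True)
```

Restarting in the `finally` branch is the main design choice here: a failed pipeline run should not also take the dashboard down until someone notices.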

I'm going to leave it on for a few more days to make sure everything is working fine. If Monday rolls around and there are no problems, I'll consider this a success and start working off of the VPS as "prod" and the RPi as "stage".

guidopetri commented 4 years ago

This looks like it was a success. From now on, prod and stage are a thing!