marcizhu / readme-chess

♟️ Play Multiplayer Chess in a README file!
https://github.com/marcizhu
MIT License

Suggestion: Include all python packages inside the repo #18

Closed happymimimix closed 2 months ago

happymimimix commented 4 months ago

[image] Every time a move is made, all of these packages need to be re-downloaded. This wastes a lot of time! So I would suggest including them in the repo so the action runs faster.

marcizhu commented 3 months ago

Hey @happymimimix, first of all thanks for your suggestion!

I'm very conscious that every time the GitHub Action runs, all dependencies need to be redownloaded, so precisely for that reason I try to keep dependencies down to a minimum. As of today, this repo only depends on three external libraries (chess for reading the PGN files and generating the moves, PyGithub to interact with GitHub's API and PyYAML to read the YAML config files). The entire process of downloading all dependencies, installing them, processing the move, committing and pushing the changes usually takes around 8s, so I wouldn't call it "a lot of time".

Nevertheless, I've explored other options including what you suggested, but in the end the time saved on installing ended up being overshadowed by the clone step, as downloading hundreds of Python files is usually slower and less efficient than downloading a single compiled and compressed wheel file. Also, updating those dependencies would require a substantial manual effort, and I'm not sure it would be worth it... I've also experimented with caching, but unfortunately it was caching all installed dependencies (including libraries installed by default), and in the end downloading this cache from GitHub was taking significantly longer than 7-8s.

If you have any other suggestions or ideas on how to optimize the run time of this repo I'm 100% open to hear them, but after quite extensive experimentation I believe the only way to make it any faster is to remove dependencies.

happymimimix commented 3 months ago

[image] Look, pip is taking 6 seconds to complete, the longest section in the whole process. Pip does take up the majority of the time. Maybe consider including the whl files in the repo, then install them from the repo instead of PyPI? Also, you can try modifying the whl files to remove some unnecessary functions that you don't need to save space, so the repo clones faster.

happymimimix commented 3 months ago

I also have another idea: Nuitka! Package your project into an exe file and upload it to the repo. Keep the source code in another branch so the master branch clones super fast. Then just run the exe file in the GitHub Action.

In addition, you can consider porting this project to a compiled language such as C++ or Rust. Keep the source code in a separate branch so the master branch clones fast. Upload the compiled executable to the master branch then just call that exe file in the github actions.

This way you spend much less time setting up the environment, and everything just works out of the box once the runner has cloned your repo.

marcizhu commented 2 months ago

Sorry for the delay in getting back to this, I've been quite busy for the last few weeks and never got the time to reply. Let me address all the points you brought up in your last couple of messages:

> Look, pip is taking 6 seconds to complete, the longest section in the whole process. Pip does take up the majority of the time.

Indeed, it does take the majority of time. But "the majority of time" is still 6 seconds. If I were able to speed this process up by 100% (which is impossible, but let's just imagine it) it would still only reduce the time by 6 seconds. More realistically though, it will just save 1-2s, if anything.

So although you're absolutely right, my point is that saving a couple of seconds isn't worth the extreme amount of complexity needed for that. If it was a quick win, a 5-minute thing, then sure. But that's not the case...
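As a rough illustration of the arithmetic above (a minimal sketch, not something from the repo): the 6 s figure is the pip time reported in this thread, while the speedup factors are purely hypothetical values chosen for the example.

```python
def seconds_saved(step_seconds: float, speedup: float) -> float:
    """Return the seconds saved when one step runs `speedup` times faster."""
    if speedup < 1:
        raise ValueError("speedup must be >= 1")
    return step_seconds - step_seconds / speedup


PIP_SECONDS = 6.0  # pip install time reported in this thread

# Hypothetical speedup factors; float("inf") models removing the step entirely.
for speedup in (1.2, 1.5, 2.0, float("inf")):
    saved = seconds_saved(PIP_SECONDS, speedup)
    print(f"pip {speedup}x faster -> saves {saved:.1f}s per move")
```

Even the limiting case, where the step disappears completely, cannot save more than the 6 s the step takes in the first place.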

> Maybe consider including the whl files in the repo then install them from the repo instead of pypi?

By doing this the clone time of the repo would increase, eating up almost all the performance gains from the pip install step. The time to download 10 MiB is the same, regardless of whether I download it from pip or from GitHub. Or it could be worse, considering that git is optimized for text files and delta compression, and not that good for big binary blobs...
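For reference, the vendored-wheel approach being discussed would look roughly like the sketch below. It only builds the pip command line; the `wheels` directory and `requirements.txt` file names are hypothetical and are not taken from this repo. `--no-index` tells pip to ignore PyPI, and `--find-links` points it at a local directory of wheel files.

```python
import sys
from typing import List


def offline_install_command(wheel_dir: str, requirements: str) -> List[str]:
    """Build a pip command that installs only from wheels stored in wheel_dir."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                # do not contact PyPI
        "--find-links", wheel_dir,   # look for wheel files in this directory
        "-r", requirements,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(offline_install_command("wheels", "requirements.txt")))
```

The command itself is simple; the cost discussed above comes from having to clone those wheel files on every run.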

> Also, you can try modifying the whl files to remove some unnecessary functions that you don't need to save space, so the repo clones faster.

Again, I'm not willing to invest huge amounts of time and effort to maintain and manually modify wheel files from libraries to shave off a couple of seconds from this project... The amount of time I would have to put into it would be orders of magnitude greater than the total time saved by all players in the entire lifetime of this project.
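The break-even reasoning can be sketched as follows. Both inputs are hypothetical placeholders (no maintenance estimate or move count is given in this thread); only the "couple of seconds" order of magnitude comes from the discussion.

```python
import math


def moves_to_break_even(maintenance_hours: float, seconds_saved_per_move: float) -> int:
    """Number of moves needed before the time saved equals the time invested."""
    if seconds_saved_per_move <= 0:
        raise ValueError("seconds_saved_per_move must be positive")
    return math.ceil(maintenance_hours * 3600 / seconds_saved_per_move)


# Hypothetical inputs: 10 hours of maintenance work, 2 seconds saved per move.
print(moves_to_break_even(maintenance_hours=10, seconds_saved_per_move=2))
```

Under these assumed numbers, the effort only pays off after many thousands of moves, and any further maintenance pushes the break-even point out again.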

> Package your project into an exe file and upload it to the repo. Keep the source code in another branch so the master branch clones super fast. Then just run the exe file in the GitHub Action.

The main 'con' I see to this approach is ideological: the project is supposed to be open source, free for anyone to see, contribute to, modify or run locally if they so wish. Compiling an EXE and running a binary from a different repo in a GitHub Action seems shady to me, and I don't think people would be comfortable with that. Plus, again, there's the extra time and effort required to recompile the EXE every time Python is updated, or some dependency is updated, or when someone contributes to the project...

> In addition, you can consider porting this project to a compiled language such as C++ or Rust. Keep the source code in a separate branch so the master branch clones fast. Upload the compiled executable to the master branch then just call that exe file in the github actions.

I did this project back in 2020 when I was still in university as a side-project to learn Python. C++ would be faster, indeed, but there are no good libraries for playing chess (at least none that I know of) nor lightweight GitHub clients. If I avoid installing python-chess and PyGithub but need to install stockfish and curl instead, what's the point? Installing those dependencies will take way more time. In fact, the GitHub Action already installs stockfish to evaluate the move and the next best move, and it already takes 9s just for stockfish. It's just not worth it, even less so if you count that I'd have to rewrite everything from scratch...

Everything you mention is a nice idea, and I'd be open to considering those if this were taking a significant amount of time per move. But honestly, at this stage I have neither the time nor the willingness to invest months of my personal time managing multiple repos with binaries, or manually editing wheel files, or rewriting the whole project from scratch in C++ just to save 1 or 2 seconds. It's just not worth it, and I hope you agree with me.

For these reasons, I will close this off for now. Maybe in the future I'll discover some new approach and revisit it. Thank you for the suggestion, though! It is always appreciated 😊

happymimimix commented 1 month ago

Thanks for your reply.