tilburgsciencehub / music-to-scrape

A fictitious music streaming service with a real website and API so you can learn how to scrape!
https://music-to-scrape.org

Make site / API launch with docker #29

Closed hannesdatta closed 10 months ago

hannesdatta commented 1 year ago

We're about to finalize this project: a website and API that operate a fictitious streaming service that students can use to learn web scraping and APIs.

We seek to ask our IT department to host this, and require a DOCKER IMAGE that launches everything.

Here's what you need to deliver: a docker compose file with R and Python and all required packages, that (a) first simulates the data, and (b) then launches the Front end and API.
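The ordering requirement above (simulate first, launch second) can be sketched in Python as a small runner that executes commands one after another and stops at the first failure. This is only an illustration: `run_in_order` and the placeholder commands are assumptions, not code from the repository, and the real setup is expected to express this ordering in the Docker files.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command to completion, in order; stop at the first failure.

    Returns the list of commands that completed successfully.
    """
    completed = []
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # A failed simulation must not be followed by an app launch.
            break
        completed.append(cmd)
    return completed


if __name__ == "__main__":
    # Placeholder commands (assumptions) standing in for the real steps.
    steps = [
        [sys.executable, "-c", "print('step (a): simulate the data')"],
        [sys.executable, "-c", "print('step (b): launch front end and API')"],
    ]
    run_in_order(steps)
```

Stopping on the first non-zero exit code means the apps are never started on top of a missing or half-built database.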

Maybe something for @DiSanchz and @Fernando-Iscar? Probably @DiSanchz has more experience but @Fernando-Iscar may want to learn this too? Let me know please!

DiSanchz commented 1 year ago

Sounds good to me @hannesdatta ! After taking a look at the readme of the project and considering the requirements for the Docker image, preparing the Dockerfile and docker-compose should be relatively straightforward. I'll get started on this tomorrow and will keep you updated on its progress. If @Fernando-Iscar would like to collaborate that would also be great!

Fernando-Iscar commented 1 year ago

For sure! I’ll be happy to contribute to it as well, @hannesdatta. I’ll arrange it with @DiSanchz 👍

hannesdatta commented 1 year ago

Thanks, both! @DiSanchz, maybe you can come up with a way to teach @Fernando-Iscar how to do that (rather than just doing it yourself). That's more sustainable and in the future, we can distribute the Docker work on more shoulders. If you learn something valuable on the way that's not part of TSH, you can draft a new issue and post it on the site.

Thanks both of you!

hannesdatta commented 1 year ago

Btw - I think you guys ideally start w/ a Docker image that has Python and R in it already... there are also ample templates online on how to run Flask & FastAPI with Docker/Docker compose... So, wouldn't do everything from scratch. Anyway, you'll figure it out...

DiSanchz commented 1 year ago

A quick update @hannesdatta: The structure of the image and the docker-compose are ready, but when testing that the app works as expected, we got stuck on an execution error in the "simulate.R" script inside the container. It returns an unexpected error regarding the connection to the database and does not produce the file required by the API and the Front end. We have explored different workarounds today but none has worked so far.

We will continue working to solve this issue. I also thought of maybe asking Thierry about this error in case he could provide us with some fresh intuition on what may be happening.

We'll keep you updated on how it progresses. Also, sorry for not providing an earlier update. We intended to let you know once everything was ready and working, as we expected to have this done already. However, we ran into other unforeseen setbacks, similar to the one we are facing now, that put us behind our initial plan.

@Fernando-Iscar

hannesdatta commented 1 year ago

Can you push your intermediate solution? Maybe I can solve it...

DiSanchz commented 1 year ago

Sure! Just pushed it to a new branch!

This push includes the current state of the dockerfile to build the image, the docker-compose and a third file, "supervisord.conf", which configures the process manager "supervisor". That is a way we found to run the API and the Frontend at once in a single docker container. Alternatively, we are also considering running the API and the Frontend in two separate but connected (through the docker-compose) docker containers, although we still have to evaluate which option works best. Let us know if you have any questions or suggestions about the pushed files!
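To illustrate the single-container option, here is a rough Python sketch of the core idea behind a process manager: start several long-running commands without waiting for any of them, then wait for all of them. It is not how supervisor is implemented or configured (a real process manager also handles restarts, logging and signals), and `run_side_by_side` and the placeholder commands are assumptions made for this example.

```python
import subprocess
import sys


def run_side_by_side(commands):
    """Start every command without waiting, then wait for all of them to exit.

    Returns the exit codes in the same order as `commands`.
    """
    # Popen returns immediately, so all processes run concurrently.
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [process.wait() for process in processes]


if __name__ == "__main__":
    # Placeholder commands (assumptions) standing in for the two servers.
    api = [sys.executable, "-c", "print('API running')"]
    frontend = [sys.executable, "-c", "print('front end running')"]
    print(run_side_by_side([api, frontend]))
```

With two separate containers, the same concurrency comes from Docker itself, with one process per container, so no in-container process manager is needed.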

DiSanchz commented 1 year ago

Regarding the error which is currently holding us back related to the generation of the data (running the script simulate.R) here is a screenshot from it. This corresponds to an attempt of running the script manually through the command line already inside the container:

[Image: error_db]

We obtain the exact same result when the script is executed by the docker-compose.

We have tried executing the script in alternative containers based on other images with R pre-installed, running it locally, checking the dockerfiles for typos and for missing dependencies (for running the script in general and for r-sqlite in particular), and using different versions of R, but we have not found the cause yet.

[Image: error_line]

We've also found that whatever isn't working as expected within containers in this regard is related to this line of code (87) shown above. Until that point everything seems to be working fine.

hannesdatta commented 1 year ago

Hey guys,

in this line of code, you're instructing docker to run the R script from the root directory. So, R thinks it is in the root, not in app/src/simulate.

I have changed the location of the database in simulate.R relative to that root, which then works. To learn which working dir R has, I'm now issuing a warning with its current working directory, and I have redefined the other directories accordingly.
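The same pitfall exists in any language: a relative path is resolved against the process's working directory, not against the script's location. A minimal Python sketch of one way to sidestep it is to build paths from the script's own directory. The function name, the script path and the file name `example.db` are hypothetical; the fix described above in simulate.R instead adjusts the paths relative to the root.

```python
import os
from pathlib import Path


def resolve_from_script(script_path, relative_path):
    """Resolve `relative_path` against the directory that contains the script,
    independent of the current working directory."""
    script_dir = Path(script_path).resolve().parent
    return script_dir / relative_path


if __name__ == "__main__":
    # Printing the working directory is the quickest way to see where a
    # relative path would actually end up.
    print("working directory:", os.getcwd())
    # Hypothetical script location and database file name.
    print(resolve_from_script("/app/src/simulate/simulate.py", "example.db"))
```

In a Python script you would typically pass `__file__` as `script_path`.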

Remember that when you're prototyping, you can set n_users to something much smaller, say 20 (so the code runs much faster). I did this for now.

However, when building an image now, a new (subsequent) error comes up: it's rebuilding the database over and over again. It seems it's not really launching Flask and FastAPI, but "restarts" "always" (as in the last line of docker-compose).

Can you investigate?
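One general way to make a restarting container harmless is to make the data step idempotent, so it builds the database only when the file is missing. The Python sketch below shows that pattern under stated assumptions: `ensure_database`, `fake_build` and `example.db` are hypothetical names, and this is not necessarily how the restart loop was resolved in this thread.

```python
import tempfile
from pathlib import Path


def ensure_database(db_path, build):
    """Call `build(db_path)` only when the file does not exist yet.

    Returns True if the database was built on this call, False if it was reused.
    """
    db_path = Path(db_path)
    if db_path.exists():
        return False
    build(db_path)
    return True


def fake_build(path):
    # Stand-in (assumption) for the real simulation step.
    path.write_text("simulated data")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = Path(tmp) / "example.db"
        print(ensure_database(db, fake_build))  # first start: builds the file
        print(ensure_database(db, fake_build))  # restart: reuses the file
```

For the guard to survive a container restart, the database file would have to live somewhere persistent, such as a mounted volume.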

When everything runs, please

hannesdatta commented 1 year ago

(I pushed my changes to the wip-29 branch.)

DiSanchz commented 1 year ago

Thank you so much for the help @hannesdatta! We were really puzzled about it and did not realize this... With that solved and a few small mods on the docker-compose.yml everything is working, the apps can now be launched with docker.

I have just pushed the changes to the wip-29 branch and in a few moments I will open a PR with these. Besides the Dockerfile and the docker-compose.yml we have also added the instructions to launch the app with docker on the readme and reset n_users to its original value of 1000.

Regarding the docker-compose, we have in the end taken the approach of running the apps in two containers side by side instead of employing the "supervisord" tool. As a consequence, the supervisord.conf file has been removed from the repository.

Let us know if you would like us to add anything else or change anything in what has already been committed!

hannesdatta commented 10 months ago

I have finalized the launch and tested it. Everything is now documented in readme.md and the new docs/ directory. Good job, everyone!