DSGT-DLP / Deep-Learning-Playground

Web Application where people new to Deep Learning can input a dataset and toy around with basic Pytorch modules without writing any code
MIT License
26 stars 8 forks source link

Upgraded to Poetry to Improve Dependency Management #863

Closed codingwithsurya closed 1 year ago

codingwithsurya commented 1 year ago

Upgraded to Poetry to Improve Dependency Management + Code Cleanup

What user problem are we solving? Currently, our Python backend strongly depends on Anaconda for environment and package management, resulting in a lack of separation between development and production environments. This creates challenges in maintaining and replicating the exact dependencies required for each environment.

What solution does this PR provide?

This pull request introduces the use of Poetry as a Python package manager tool to better define our development and production dependencies. By adopting Poetry, we can achieve better separation between the two environments and simplify the management of dependencies. In addition, this sped up build checks from 30mins to 5 mins. This migration also allows for venv caching as well.

The key changes in this PR include:

Created a pyproject.toml file in the poetry branch, which serves as the configuration file for Poetry.

Populated the pyproject.toml file with the necessary dependencies, both for development and production modes. The dependencies were inspired by the existing conda/environment.yml or requirements.txt files.

Updated the Python build check in the GitHub Actions workflow to use Poetry for environment setup. This ensures that the correct dependencies are installed during the CI/CD process.

Updated the Dockerfiles to use Poetry for environment setup. This ensures consistency between local development and deployment environments.

Ensured that installing Python dependencies updates the pyproject.toml file automatically. This eliminates the need to manually add dependencies to the conda/environment.yml file or requirements.txt.

Code Cleanup

Also some minor code cleanup occurred with the wiki.tsx file. The path names were wrong and directly needed the entire path when accessing the png file. So, import statements were deleted and path names were directly implemented in the code. Also the learn.tsx file was deleted as it was blank and not in use.

Testing Methodology How did you test your changes and verify that existing functionality is not broken

To test these changes and verify that existing functionality is not broken, the following steps were taken:

Checked out the poetry-setup branch locally and ensured that Poetry was installed.

Built and ran the backend application locally using Poetry for environment setup. Confirmed that the application started successfully and all existing functionality was working as expected.

Triggered the CI/CD pipeline using the GitHub Actions workflow and verified that the build process completed without errors.

updated Docker files with the new Poetry-based environment setup and tested using docker build.

Requested feedback and conducted code reviews with team members to validate the changes and address any potential issues.

Any other considerations

Update documentation and ReadME files + we need to gitignore poetry.lock

add pre-commit hooks in poetry to make sure poetry.lock and pyproject.toml are correctly in sync

codingwithsurya commented 1 year ago

no python-package-conda.yml action no requirements.txt no conda/environment.yml (conda deprecation) update installation script in package.json at the root update dockerfiles (train dockerfile + Dockerfile) update readme to describe how to get setup on poetry (and add resource on pyenv to get the right python version) add pre-commit hooks in poetry to make sure poetry.lock and pyproject.toml are correctly in sync pre-commit hooks of interest to add are: poetry-check, poetry-lock. Learn more How to build pre-commit hooks into github: Here

Precommit YAML file to add to: Here

make sure to add logger to the list of packages as we just added it

karkir0003 commented 1 year ago

poetry-setup branch is not in use anymore. please update pr description? @codingwithsurya

karkir0003 commented 1 year ago

you didnt actually deploy docker containers in your testing. all you did was build dockerfiles and run them with docker run

karkir0003 commented 1 year ago

youll also want to explain the purpose of cleaning up imports in wiki.tsx and remove learn.tsx

maybe these import cleanups should be its own mini pr for better tracking!

codingwithsurya commented 1 year ago

youll also want to explain the purpose of cleaning up imports in wiki.tsx and remove learn.tsx

maybe these import cleanups should be its own mini pr for better tracking!

Let's just put this in the title and make it clear that we did some code cleanup for better tracking. Otherwise, as u saw before, i'm going to have to re-do all my changes to poetry branch. And honestly, the wiki.tsx file is the only one that needs to be explained, since learn.tsx will be implemented soon in learning modules and is currently blank.

karkir0003 commented 1 year ago

so we are just lumping cleanup wirh poetry to save time? @codingwithsurya

codingwithsurya commented 1 year ago

so we are just lumping cleanup wirh poetry to save time? @codingwithsurya

yep. thats fine, right? we wanna get this out ASAP.

karkir0003 commented 1 year ago

so we are just lumping cleanup wirh poetry to save time? @codingwithsurya

yep. thats fine, right? we wanna get this out ASAP.

considering the complexity of the extra cleanup, you're good. you can keep working on this PR

karkir0003 commented 1 year ago

@codingwithsurya I tested your poetry setup on my end. I got it to work for windows! backend and frontend spin up perfectly fine.

karkir0003 commented 1 year ago

@farisdurrani this PR can be merged contingent on the following

  1. My PR gets merged into this branch
  2. You test the poetry setup on your branch
karkir0003 commented 1 year ago

I had to run the command poetry env use $(pyenv which python) to get poetry to use the version of python that I installed through pyenv to match the toml file python version requirements, anyone else?

so you need to use the latest version of python 3.9 in order for poetry install to work @dwu359

dwu359 commented 1 year ago

I had to run the command poetry env use $(pyenv which python) to get poetry to use the version of python that I installed through pyenv to match the toml file python version requirements, anyone else?

so you need to use the latest version of python 3.9 in order for poetry install to work @dwu359

So I ran

pyenv install 3.9
pyenv local 3.9
yarn startb

And that gave me an error unless I created a new poetry virtual env specifying the specific pyenv install for python 3.9

dwu359 commented 1 year ago

I also ran into an error when building the docker image from Dockerfile: Type error: '"@/features/Train/redux/trainspaceApi"' has no exported member named 'useGetColumnsFromDatasetQuery'. Did you mean 'useLazyGetColumnsFromDatasetQuery'? Someone probably forgot to change the names to match in a previous pr, shouldn't be hard to fix This error was caused by the ImageParametersStep btw

dwu359 commented 1 year ago

Would it be a good idea to have two separate poetry environments, one for the backend, and one for backendCore btw? We already have two separate docker files for each

karkir0003 commented 1 year ago

Would it be a good idea to have two separate poetry environments, one for the backend, and one for backendCore btw? We already have two separate docker files for each

that would be a good idea. if we know what deps are used on backend/ vs. backendCore, we can make the builds probably more optimized?

farisdurrani commented 1 year ago

@codingwithsurya can you please add poetry shell at the beginning of the yarn run installb instruction?

karkir0003 commented 1 year ago

@codingwithsurya can you please add poetry shell at the beginning of the yarn run installb instruction?

added it in. Can you check @farisdurrani

karkir0003 commented 1 year ago

I also ran into an error when building the docker image from Dockerfile: Type error: '"@/features/Train/redux/trainspaceApi"' has no exported member named 'useGetColumnsFromDatasetQuery'. Did you mean 'useLazyGetColumnsFromDatasetQuery'? Someone probably forgot to change the names to match in a previous pr, shouldn't be hard to fix This error was caused by the ImageParametersStep btw

I also ran into an error when building the docker image from Dockerfile: Type error: '"@/features/Train/redux/trainspaceApi"' has no exported member named 'useGetColumnsFromDatasetQuery'. Did you mean 'useLazyGetColumnsFromDatasetQuery'? Someone probably forgot to change the names to match in a previous pr, shouldn't be hard to fix This error was caused by the ImageParametersStep btw

updated that @dwu359

codingwithsurya commented 1 year ago

as of saturday evening, everything works except docker builds/runs. The dockerfiles currently committed are a draft and need to be changed. Need some help on this. @karkir0003 @farisdurrani @dwu359

also for some reason my docker builds take insanely long (15-20 mins) so need to figure out how to speed up the process.

dwu359 commented 1 year ago

as of saturday evening, everything works except docker builds/runs. The dockerfiles currently committed are a draft and need to be changed. Need some help on this. @karkir0003 @farisdurrani @dwu359

also for some reason my docker builds take insanely long (15-20 mins) so need to figure out how to speed up the process.

What happens when you try to build a docker image?

karkir0003 commented 1 year ago

as of saturday evening, everything works except docker builds/runs. The dockerfiles currently committed are a draft and need to be changed. Need some help on this. @karkir0003 @farisdurrani @dwu359 also for some reason my docker builds take insanely long (15-20 mins) so need to figure out how to speed up the process.

What happens when you try to build a docker image?

same. I tried making some updates/fixes to Dockerfile. The build takes pretty long, but the image size is like 12 GB on docker desktop. was wondering how we can optimize this. Thoughts @dwu359

codingwithsurya commented 1 year ago

as of saturday evening, everything works except docker builds/runs. The dockerfiles currently committed are a draft and need to be changed. Need some help on this. @karkir0003 @farisdurrani @dwu359 also for some reason my docker builds take insanely long (15-20 mins) so need to figure out how to speed up the process.

What happens when you try to build a docker image?

So TrainingDockerFile built just fine but for the regular dockerfile we initially were facing errors with the path in yarn run secrets command. We fixed that and we were trying to re-test but dockerbuild was taking sooooo long (like around 20 mins or so). We were already 4-5 hours into working so at that point we were kinda tired and didn't keep trying to see if the test worked. I wonder if docker runs faster on your guys's machines.

karkir0003 commented 1 year ago

as of saturday evening, everything works except docker builds/runs. The dockerfiles currently committed are a draft and need to be changed. Need some help on this. @karkir0003 @farisdurrani @dwu359 also for some reason my docker builds take insanely long (15-20 mins) so need to figure out how to speed up the process.

What happens when you try to build a docker image?

So TrainingDockerFile built just fine but for the regular dockerfile we initially were facing errors with the path in yarn run secrets command. We fixed that and we were trying to re-test but dockerbuild was taking sooooo long (like around 20 mins or so). We were already 4-5 hours into working so at that point we were kinda tired and didn't keep trying to see if the test worked. I wonder if docker runs faster on your guys's machines.

@codingwithsurya I was able to get Dockerfile to build on my machine. Had to make some minor fixes

karkir0003 commented 1 year ago

I'm testing docker build -t train -f TrainingContainer.Dockerfile . ... @codingwithsurya

codingwithsurya commented 1 year ago

as of saturday evening, everything works except docker builds/runs. The dockerfiles currently committed are a draft and need to be changed. Need some help on this. @karkir0003 @farisdurrani @dwu359 also for some reason my docker builds take insanely long (15-20 mins) so need to figure out how to speed up the process.

What happens when you try to build a docker image?

So TrainingDockerFile built just fine but for the regular dockerfile we initially were facing errors with the path in yarn run secrets command. We fixed that and we were trying to re-test but dockerbuild was taking sooooo long (like around 20 mins or so). We were already 4-5 hours into working so at that point we were kinda tired and didn't keep trying to see if the test worked. I wonder if docker runs faster on your guys's machines.

@codingwithsurya I was able to get Dockerfile to build on my machine. Had to make some minor fixes

oh nice man! what were the changes?

karkir0003 commented 1 year ago

lemme push it real quick. minor ordering of the commands

karkir0003 commented 1 year ago

now the catch was that i didnt do a docker run. it took so long to see if it worked on my end.

codingwithsurya commented 1 year ago

now the catch was that i didnt do a docker run. it took so long to see if it worked on my end.

so both trainingdockerfile and dockerfile builds work now?

codingwithsurya commented 1 year ago

now the catch was that i didnt do a docker run. it took so long to see if it worked on my end.

yea no worries , it takes ages. docker run is pretty slow too. might just put a movie on to pass the time lol

karkir0003 commented 1 year ago

@codingwithsurya pushed some of my changes/fixes for poetry

codingwithsurya commented 1 year ago

@codingwithsurya pushed some of my changes/fixes for poetry

awesome. good thought on switching to root directory right before COPY ..

and all the consecutive commands cd to backend

karkir0003 commented 1 year ago

@codingwithsurya pushed some of my changes/fixes for poetry

awesome. good thought on switching to root directory right before COPY ..

and all the consecutive commands cd to backend

can you give the docker build a try for Dockerfile and make sure the docker run works as intended. Also, im trying to test docker build for TrainingContainer.Dockerfile. Getting some frustrating connection pool timed out error from pypy when the poetry install is running

karkir0003 commented 1 year ago

@codingwithsurya i got TrainingContainer.Dockerfile to build successfully.

karkir0003 commented 1 year ago

@dwu359 potential tool for finding unused deps for our pyproject.toml files

https://github.com/fpgmaas/deptry

karkir0003 commented 1 year ago

@dwu359 potential tool for finding unused deps for our pyproject.toml files

https://github.com/fpgmaas/deptry

https://fpgmaas.github.io/deptry/rules-violations/

dwu359 commented 1 year ago

@dwu359 potential tool for finding unused deps for our pyproject.toml files

https://github.com/fpgmaas/deptry

Hmm yeah it does seem like something that would be neat to add to one of our GitHub workflows

karkir0003 commented 1 year ago

@dwu359 potential tool for finding unused deps for our pyproject.toml files https://github.com/fpgmaas/deptry

Hmm yeah it does seem like something that would be neat to add to one of our GitHub workflows

not sure if this library is good enough. I was trying to see if it has decent github star count + fairly recent commits to see if the repo is actively maintained