Dockerize and fix installation on linux

fani-lab / LADy

LADy 💃: A Benchmark Toolkit for Latent Aspect Detection Enriched with Backtranslation Augmentation

Dockerize and fix installation on linux #49

Closed 3ripleM closed 9 months ago

3ripleM commented 10 months ago

Following an issue reported on the spaCy repository, I had to add typing_extensions to our dependencies in order to set up the project.

Issue Page: Pydantic issubclass error for python 3.8 and 3.9

You should add this line to the requirement.txt: typing_extensions==4.4.0

hosseinfani commented 10 months ago

@3ripleM @farinamhz No problem, but also put a quick comment in the requirement.txt. Also, update the environment.yml if needed. Thanks.
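A minimal sketch of that change, assuming the file is named requirement.txt as written in this thread. The helper name `pin_typing_extensions` is hypothetical and only for illustration; it appends the pin together with a short explanatory comment, and skips the write if the pin is already present.

```shell
# Hypothetical helper (not part of LADy): append the pin plus a comment
# to a requirements file, unless the exact pin line already exists.
pin_typing_extensions() {
  req_file="$1"
  pin="typing_extensions==4.4.0"

  # -x matches the whole line, -F treats the pin as a fixed string.
  if ! grep -qxF "$pin" "$req_file" 2>/dev/null; then
    {
      echo "# Pinned to avoid the pydantic issubclass error (spaCy, Python 3.8/3.9)"
      echo "$pin"
    } >> "$req_file"
  fi
}
```

Calling `pin_typing_extensions requirement.txt` more than once leaves a single copy of the pin, so it is safe to rerun.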

3ripleM commented 10 months ago

I also dockerized the setup process, so it's now easy to reproduce on any machine :) We can also publish the Docker image to a private container registry to make it available to anyone who wants to run the codebase (with all the requirements already installed).

something like this:

docker run --volume "$(pwd)/output:/app/output" fani_lab/lady \
    python main.py \
    -naspects 5 \
    -am rnd \
    -data ../data/raw/semeval/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml \
    -output ../output/toy.2016SB5/

farinamhz commented 10 months ago

Thank you very much @3ripleM and welcome to LADy :)

hosseinfani commented 10 months ago

@3ripleM how can we have a container registry? why not a public container registry?

3ripleM commented 10 months ago

@hosseinfani Yeah, we can have a public one. It was only a suggestion, but in either case it would be great to have one.

To set it up, I think we can use the GitHub container registry (GitHub Packages). We have two options:

Manually push to the registry (which means that we have to make the image on a local computer)
Automatic build process (which relies on github action to do image building and pushing for us)

For the second option, we can either run the build automatically whenever we push to the master branch, or trigger the pipeline manually via a button.

hosseinfani commented 10 months ago

@3ripleM Can we go with the second option and the automatic build, but also run a quick test on a toy dataset? If so, would you please handle that?

3ripleM commented 10 months ago

@hosseinfani Yes, I believe we can build in multiple stages: once the build is finished, we test the image by running it in the cloud to check that everything works. After that, we push the image to the registry.
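The build, test, push order described above could be sketched roughly as follows. This is only an illustration, not the pipeline that was eventually set up: the image name `ghcr.io/fani-lab/lady:latest` and the function name `build_test_push` are assumptions, and the toy run reuses the arguments from the `docker run` example earlier in this thread. The `DOCKER` variable is an added convenience so the sequence can be dry-run.

```shell
# Assumed image name; replace with the real registry path.
IMAGE="${IMAGE:-ghcr.io/fani-lab/lady:latest}"
# Command used to talk to Docker; override (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# Build the image, smoke-test it on the toy dataset, and push only if
# both earlier steps succeeded.
build_test_push() {
  "$DOCKER" build -t "$IMAGE" . || return 1

  "$DOCKER" run --rm "$IMAGE" \
      python main.py \
      -naspects 5 \
      -am rnd \
      -data ../data/raw/semeval/toy.2016SB5/ABSA16_Restaurants_Train_SB1_v2.xml \
      -output ../output/toy.2016SB5/ || return 1

  "$DOCKER" push "$IMAGE"
}
```

Running `DOCKER=echo build_test_push` prints the arguments of the three invocations without contacting Docker. A failing toy run stops the function before the push, which is the point of gating the registry upload on the test.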

I'll keep you updated

3ripleM commented 10 months ago

@hosseinfani Hossein, our current method of processing toy datasets consumes a significant amount of time and resources.

I would like to propose a solution to address this issue. We could consider releasing two versions of our model:

Full Version: This version would require the use of a separate system equipped with a GPU.
Lite Version

Furthermore, I am of the opinion that we could explore the possibility of setting up a GitHub Actions runner on our servers that have GPUs. This approach could potentially streamline and automate the process. However, further investigation is necessary to determine the feasibility of this solution.

hosseinfani commented 10 months ago

@3ripleM

Really? It's a toy dataset; it shouldn't take a lot of time. Let's discuss it in person.
We don't have any internal GPU server, nor should we think of having one. We need to provide a unit test that runs on any PC.

hosseinfani commented 10 months ago

@3ripleM Please prepare two guidelines as an md file for our library: 1- From the producer's perspective, how to dockerize the project 2- From the consumer's perspective, how to use the Docker image

3ripleM commented 9 months ago

@hosseinfani @farinamhz

Dockerization is done and the readme is updated.

farinamhz commented 9 months ago

Great, thank you very much, @3ripleM!

hosseinfani commented 9 months ago

@3ripleM thank you. pls close the issue then.