AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Docker builds failing when built via GHA #3484

Open davidsmejia opened 2 months ago

davidsmejia commented 2 months ago

Context

Building the Docker image fails on GitHub Actions (GHA), even though the exact same build steps succeed locally. This is preventing deploys and blocking debugging of processing on staging.

Problem or idea

Failure:

#10 2.278 WARNING: Ignoring version 6.5.0 of django-elasticsearch-dsl since it has invalid metadata:
#10 2.278 Requested django-elasticsearch-dsl==6.5.0 from https://files.pythonhosted.org/packages/1b/02/a0f1eae33da9ea0509cac63ae3eb239b4228cd6898fcd75df570ccd72814/django_elasticsearch_dsl-6.5.0-py2.py3-none-any.whl (from -r requirements.txt (line 18)) has invalid metadata: Expected matching RIGHT_PARENTHESIS for LEFT_PARENTHESIS, after version specifier
#10 2.278     elasticsearch-dsl (>=6.4.0<7.0.0)
#10 2.278                       ~~~~~~~~^
#10 2.278 Please use pip<24.1 if you need to use this version.

It looks like the metadata for this particular version is malformed: the specifier `>=6.4.0<7.0.0` is missing the comma between its two clauses. Additionally, I have seen a different error for a different package, but it isn't appearing in the latest execution of the build step.
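To make the malformed specifier concrete, here is a minimal sketch of a comma check on a version specifier. It is a simplified illustration using only the standard library, not how pip validates metadata; the helper name `is_well_formed` and the reduced clause grammar (no wildcards, epochs, or local versions) are assumptions made for this example.

```python
import re

# One simplified version clause: an operator followed by a dotted version,
# e.g. ">=6.4.0". This is NOT the full PEP 440 grammar (assumption for the sketch).
_CLAUSE = re.compile(r"\s*(~=|==|!=|<=|>=|<|>)\s*[0-9]+(\.[0-9]+)*\s*")


def is_well_formed(specifier: str) -> bool:
    """Return True if every comma-separated part is exactly one clause."""
    return all(_CLAUSE.fullmatch(part) for part in specifier.split(","))


# The specifier from the build log, and the same specifier with the comma restored.
print(is_well_formed(">=6.4.0<7.0.0"))   # False
print(is_well_formed(">=6.4.0,<7.0.0"))  # True
```

The first call fails because, without a comma, the two clauses run together into a single part that no longer matches one clause.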

Solution or next step

davidsmejia commented 3 weeks ago

At this time I don't believe that caching is enough to entirely explain the weirdness going on here.

Previously, we were using django-elasticsearch-dsl version 6.5.0. pip reports that 6.5.0 has invalid metadata: its dependency specifier uses a notation that older versions of pip accepted but newer ones reject (the warning recommends pip<24.1).
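Since the warning in the log points at pip<24.1, one quick way to tell whether a given environment is affected is to compare its pip version against that boundary. The sketch below assumes the 24.1 threshold taken from the log message; the function name `pip_rejects_legacy_metadata` is hypothetical, and only the leading `major.minor` part of the version string is considered.

```python
import re


def pip_rejects_legacy_metadata(pip_version: str) -> bool:
    """Return True if pip_version is 24.1 or newer.

    Only the leading "major.minor" is compared; suffixes such as ".2" or
    "b1" are ignored (a simplification for this sketch).
    """
    match = re.match(r"(\d+)\.(\d+)", pip_version)
    if match is None:
        raise ValueError(f"unrecognized pip version: {pip_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (24, 1)


print(pip_rejects_legacy_metadata("24.0"))    # False
print(pip_rejects_legacy_metadata("24.1.2"))  # True
```

If the local virtualenv and the GHA image disagree on this check, that alone could explain why the same requirements resolve in one place and not the other.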

This can be resolved locally by loosening the version requirement so that it ultimately resolves to 6.4.2. However, the issue still persists on GHA when building the images.

I was able to reproduce and resolve this locally with the following steps.

Note: I use pyenv to manage multiple versions of Python for various projects. Also, it was important to switch Python versions before removing and recreating the dr_env folder, as I did not see any build errors when starting with version 3.10.

pyenv global 3.8
rm -r dr_env
./scripts/create_virtualenv.sh
source dr_env/bin/activate
cd api
pip-compile --annotation-style=line requirements.in
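Because the steps above depend on which Python version creates the virtualenv, it may help to record the interpreter and pip versions in both the local environment and the GHA build and compare them. This is a small sketch using only the standard library; the function name `build_env_summary` is hypothetical, and where to call it (for example, from a Dockerfile `RUN` step) is left as an assumption.

```python
import sys
from importlib import metadata


def build_env_summary() -> dict:
    """Collect the Python and pip versions of the current environment."""
    try:
        pip_version = metadata.version("pip")
    except metadata.PackageNotFoundError:
        # pip is not installed in this environment.
        pip_version = "not installed"
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "pip": pip_version,
    }


# Print the summary so it shows up in local output and in the build log.
print(build_env_summary())
```

Running this once inside the activated dr_env and once inside the image build would show whether the two environments actually use the same Python and pip.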

Next steps:

I think the next step is to try to get a successful local build with Python 3.8 and django-elasticsearch-dsl 6.5.0.