NVIDIA / data-science-stack

NVIDIA Data Science stack tools
Apache License 2.0

build-container using data-science-stack-pinned.yaml fails with conflicts and unable to create new pin.yaml #135

Open rezroo opened 9 months ago

rezroo commented 9 months ago

Running "./data-science-stack build-container" fails after 24 hours, with conda reporting hundreds of Python package conflicts:

9 ERROR: process "/bin/sh -c ${CONDA_ROOT}/bin/conda env create -n data-science-stack-${STACK_VERSION} -f /environment-pinned.yaml" did not complete successfully: exit code: 1

I then tried to generate a new set of pinned dependency versions by running "./data-science-stack pin data-science-stack". After 84 hours at 100% CPU, conda has still not finished. Nothing has been printed to stdout other than 4 warnings at the start, more than 3 days ago:

$ ./data-science-stack pin data-science-stack

NV### Sat 02 Mar 2024 10:18:07 PM PST #### START Pinning versions to data-science-stack-pinned.yaml

python3.7: Pulling from frolvlad/alpine-miniconda3
Digest: sha256:d0f3f7eb69fda9d203e72ec162a1a2813993d9c791efe9d6bcf397d6858fd4b1
Status: Image is up to date for frolvlad/alpine-miniconda3:python3.7
docker.io/frolvlad/alpine-miniconda3:python3.7

NV### Sat 02 Mar 2024 10:18:08 PM PST #### conda create -v --name snapper -c rapidsai -c conda-forge -c nvidia -c pytorch -c defaults "python=3.10" "cudatoolkit=11.8" "rapids=23.04" conda-forge::adlfs conda-forge::cffi conda-forge::cmake conda-forge::dask-glm conda-forge::dask-kubernetes conda-forge::dask-jobqueue conda-forge::dask-labextension conda-forge::dask-ml conda-forge::fastparquet conda-forge::flatbuffers conda-forge::gcsfs conda-forge::hypothesis conda-forge::ipython conda-forge::ipyvolume conda-forge::ipywidgets conda-forge::jupyterlab rapidsai::jupyterlab-nvdashboard conda-forge::lapack conda-forge::lime conda-forge::matplotlib conda-forge::networkx conda-forge::nltk conda-forge::nodejs conda-forge::opencv conda-forge::pytest conda-forge::python-graphviz conda-forge::python-snappy "conda-forge::pytorch>=1.8" conda-forge::rapidjson conda-forge::seaborn conda-forge::statsmodels conda-forge::tensorflow pytorch::torchvision --dry-run

NV### Sat 02 Mar 2024 10:18:08 PM PST #### This will take a while...

INFO conda.gateways.repodata:conda_http_errors(203): Unable to retrieve repodata (response: 404) for https://conda.anaconda.org/rapidsai/linux-64/current_repodata.json
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0

How long is "a while"? If it is expected to take more than the 3 days it has already run, then we need the maintainers to step in and address this issue for the community so that the repository is usable again.
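
Until the expected duration is known, one stopgap is to put an upper bound on the pin run so that it cannot spin for days. This is a minimal sketch, assuming Python 3 is available on the host. The helper name `run_with_timeout` and any time limit passed to it are hypothetical and not part of this repository.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and capture stdout and stderr together.

    Returns (returncode, output). If the time limit is reached, the child
    process is killed and (None, partial_output) is returned instead.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,
        )
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        # Depending on the Python version, the partial output may be
        # None, bytes, or str; normalize it to str.
        out = exc.stdout or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return None, out
```

For example, `run_with_timeout(["./data-science-stack", "pin", "data-science-stack"], 6 * 3600)` would give up after six hours. Note that only the direct child process is killed on timeout; if the script has started a Docker container, that container may keep running and would need to be stopped separately.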

rezroo commented 9 months ago

Is this repo even maintained by anyone? I expected a bit more - like a simple acknowledgement - from a $2T company.