jin-eld opened this issue 11 months ago
Before running the training code, try this
!pip uninstall -y keras tensorflow tensorflow-probability absl-py astunparse flatbuffers gast google-pasta grpcio h5py keras keras-preprocessing libclang numpy opt-einsum protobuf setuptools six tensorboard tensorflow-io-gcs-filesystem termcolor tf-estimator-nightly typing-extensions wrapt
!pip install --disable-pip-version-check --no-cache-dir tensorflow==2.11.0
!pip install tensorflow-probability==0.15.0
!pip install keras==2.11.0
@yk7244 Are you successfully training your model with these versions of the listed packages?
Some associates at university and I are trying to use this library for a project, but we are only getting degenerate results, despite training for many hours on datasets which are normalized and, to all appearances, perfect for the task. We tried different sample rates and different parameters, but nothing seems to be remotely usable.
I had the same issue: it only ran in CPU mode, so it took many hours. I solved this by adding one more step before running the code: go to Tools > Command palette > "Use fallback runtime version" (you need to scroll down a bit), then run the code. In summary:
- Change to the fallback runtime version
!pip uninstall -y keras tensorflow tensorflow-probability absl-py astunparse flatbuffers gast google-pasta grpcio h5py keras keras-preprocessing libclang numpy opt-einsum protobuf setuptools six tensorboard tensorflow-io-gcs-filesystem termcolor tf-estimator-nightly typing-extensions wrapt
!pip install --disable-pip-version-check --no-cache-dir tensorflow==2.11.0
!pip install tensorflow-probability==0.15.0
!pip install keras==2.11.0
!pip install --upgrade ddsp
Run the original training cell (the long one). The problem seems to be caused by the CUDA update in the default Colab runtime: the updated CUDA is 12, but the TensorFlow version that DDSP supports needs CUDA 11, so you should use the fallback runtime version to go back to the older one.
This should work; a quick version check is sketched below.
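For reference, a minimal sanity check you can run in a cell after the reinstall, to confirm the pins took effect and that a GPU is visible (the expected numbers are just the versions pinned above):

# quick sanity check, run as a notebook cell
!python -c "import tensorflow as tf, tensorflow_probability as tfp, keras; print(tf.__version__, tfp.__version__, keras.__version__); print('GPUs:', tf.config.list_physical_devices('GPU'))"

Expect 2.11.0, 0.15.0, 2.11.0 and a non-empty GPU list; an empty list means the runtime is still in CPU mode.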
@mightimatti
@yk7244 Thank you, I will give it a try with these specific versions. I wish there were a known good dataset available to test whether the model I'm training reproduces their results. There are so many variables in ML training that it's really hard to troubleshoot if you're not even confident your tech stack is performing as expected.
@mightimatti Once you run it in GPU mode (check that you set the GPU runtime) it only takes 10-20 min. Environment setup for AI/machine learning is a real pain in the ass... every version requirement has to be met. However, in my opinion, that's the beauty of this AI technology, since we can expect or anticipate very little of the result. It's like a surprise :)
@mightimatti I modified my comment above a bit with the solution. Please check.
@yk7244 I switched to a venv setup, so uninstalling anything is pointless in my case, since I start with a clean environment. I went back to Python 3.7 and I think I did tune the requirements.txt a bit (will check tonight and report back). In the end I managed to install it in the 3.7 venv.
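Roughly, such a venv setup looks like this (just a sketch; the pinned versions are assumptions carried over from the Colab fix above, and the ddsp version matches the Dockerfile further down):

python3.7 -m venv ddsp-venv
source ddsp-venv/bin/activate
pip install --upgrade pip
# pins assumed from the Colab fix above (TF 2.11 still has wheels for Python 3.7-3.10)
pip install tensorflow==2.11.0 tensorflow-probability==0.15.0 keras==2.11.0
pip install "ddsp[data_preparation]==3.6.0"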
By the way, the Colab did not work for me; it errored out at some point, but I did not follow up on it. I was hoping to download the pretrained models from the Colab space...
I was also wondering whether there are any existing pretrained models to use with ddsp, as I could not find anything to download?
> Environment setup for AI/machine learning is a real pain in the ass... every version requirement has to be met. However, in my opinion, that's the beauty of this AI technology, since we can expect or anticipate very little of the result. It's like a surprise :)
@yk7244 I totally agree about the PITA point; it's a mess in pretty much every AI/ML project. The tons of incompatible Python dependencies add a lot to it, and on top of that, in my case, throw ROCm into the mix for some extra fun, especially for projects which were not written with ROCm support in mind.
I think another aspect contributes to it: the AI/ML folks are primarily scientists, and while they are really good at the scientific part, I see a lack of software engineering skills in many projects, which makes the code difficult to understand and maintain. Often code is thrown out to show that a paper works, but then it is not picked up by actual SW engineers who would sit down and rewrite it into nicely designed software. I hope this will settle at some point, once different parts of the community start to collaborate a bit better, and of course one has to have the time to actually do this :)
I come from the SW side of things and, to be fair, I have not contributed much either at this point, partly because of a lack of time, but also simply because I have not yet found a project to focus on. I keep trying out the many, many exciting things with varying success and moving on to the next one, hoping to get better results; there are just so many AI projects out there now :)
Hi, so I actually came up with another solution which should be somewhat future-proof. This assumes you have access to hardware to run it on. I wrote a Dockerfile that pulls an old Docker image of Colab, which features the supported runtime, and installs the necessary dependencies. Once you run it, you can go to Colab and select "Connect to a local runtime". This is the Dockerfile:
FROM europe-docker.pkg.dev/colab-images/public/runtime:release-colab_20230921-060057_RC00

# remove the preinstalled packages that conflict with the pins below
RUN pip uninstall -y keras tensorflow tensorflow-probability absl-py astunparse flatbuffers gast google-pasta grpcio h5py keras keras-preprocessing libclang numpy opt-einsum protobuf setuptools six tensorboard tensorflow-io-gcs-filesystem termcolor tf-estimator-nightly typing-extensions wrapt
# reinstall the TensorFlow stack at the versions ddsp 3.6.0 works with
RUN pip install --disable-pip-version-check --no-cache-dir tensorflow==2.11.0
RUN pip install tensorflow-probability==0.15.0
RUN pip install keras==2.11.0
RUN pip install crepe==0.0.12
RUN pip install ddsp[data_preparation]==3.6.0
I saved this to a file called Dockerfile
and then ran
docker build -t colab_local - <Dockerfile
This will take a LONG time, as the Docker image is 12 GB.
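Before pointing Colab at it, it may be worth verifying that Docker can actually see the GPU (this assumes the NVIDIA Container Toolkit is installed on the host, which injects nvidia-smi into the container):

docker run --rm --gpus all colab_local nvidia-smi

If this prints the usual GPU table, the Colab container started below should get GPU access too.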
After this, and assuming you have all your NVIDIA drivers and permissions set up to allow Docker to access your GPU (I hadn't!), you can spin up a Docker container running Colab which uses your local GPUs like this:
docker run -p 127.0.0.1:9000:8080 --gpus all colab_local
Within this Colab instance you should be able to run everything as intended by the notebooks. If you're storing your checkpoints etc. to disk, you might need to map a container directory to a host directory.
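For example, something like this keeps checkpoints on the host (the container-side path is an assumption; use whichever directory your notebook actually writes to):

docker run -p 127.0.0.1:9000:8080 --gpus all \
  -v "$HOME/ddsp_checkpoints:/content/ddsp_checkpoints" \
  colab_local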
Hi,
it seems I have a similar issue to the one described here: https://github.com/magenta/ddsp/issues/376
I am on Fedora release 38 and I ran "pip install --upgrade ddsp" as described in the installation section. At some point pip started to print a warning and it began to download old versions of packages, for example:
Same happens for many other dependencies as well. Any idea how to get past this?
The old issue referred to a specific, older Python version; I currently have Python 3.11.6. Is that supported, or am I supposed to install a specific version?
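A quick way to see what pip is resolving, as a sketch (the --dry-run flag needs pip 22.2 or newer):

python --version
pip --version
# resolve without installing, to see which ddsp/tensorflow versions pip settles on
pip install --dry-run --upgrade ddsp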
Kind regards, Jin