Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

Can't build docker image for TF2 and GPU/CPU #1403

Closed jonGuti13 closed 7 months ago

jonGuti13 commented 8 months ago

I am trying to build the Vitis AI 3.5 Docker image for TF2, for either CPU or GPU, by running the following command:

./docker_build.sh -t cpu -f tf2 #For CPU

It fails, as can be seen in the following screenshot:

Selección_060

This is a screenshot of my CI pipeline, updated with upstream as of 06/02/2024, which I have included in https://github.com/Xilinx/Vitis-AI/pull/1404.

I managed to build both images (CPU and GPU) two months ago, so something must have changed.

By the way, #1350 reports the same problem, except that the framework there is PyTorch.

jonGuti13 commented 7 months ago

I still get the error. Has anyone checked this? Thank you in advance.

vaibuild commented 7 months ago

Hi @jonGuti13, could you please try with the latest commit from the master branch? This issue has been solved there.

jonGuti13 commented 7 months ago

Hi @vaibuild,

Thank you for your response, but I have tried it and it still fails for me. This is a screenshot showing that my branch is ahead of Xilinx/Vitis-AI-master by three commits.

Selección_077

The first commit removes a confirmation prompt so the build can run in CI, the second one creates .github/workflows/ci.yml, and the third one changes .github/workflows/ci.yml to build the GPU image.

Both actions have failed. The GPU one failed because there was not enough space left on the device, so I will only attach the screenshot of the CPU image failure (the one I originally reported in this issue).

Selección_076

vaibuild commented 7 months ago

Hi @jonGuti13, can you try on another server, building directly from master? I cannot reproduce your problem. As for the core dump, maybe you can see if https://github.com/conda/conda/issues/7815 helps. Try adding conda clean -a in install_tf2.sh #107, before the mamba install command executes.
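
A minimal sketch of that suggestion, assuming the cleanup is wrapped in a small helper. The function name clean_conda_cache is hypothetical, and the actual contents of install_tf2.sh are not shown in this thread, so the position of the call is only indicated by a comment. The -y flag is added so the command does not wait for confirmation in a non-interactive Docker build.

```shell
#!/bin/sh
# Sketch only: not the real install_tf2.sh, just the cleanup step on its own.
set -eu

# Remove conda's cached index, tarballs and unused packages.
# -a is short for --all; -y answers the confirmation prompt automatically.
clean_conda_cache() {
    if command -v conda >/dev/null 2>&1; then
        conda clean -a -y
    else
        # Keep the sketch runnable on machines without conda.
        echo "conda not found; skipping cache cleanup" >&2
    fi
}

clean_conda_cache
# The existing mamba install command from install_tf2.sh would run after this point.
```

Guarding on command -v keeps the helper harmless outside the build image; inside the image, where conda is present, it simply runs the cleanup.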

jonGuti13 commented 7 months ago

Hi @vaibuild,

Thank you for your quick response. I have been able to build both the GPU and CPU images locally, so the problem has been solved with the latest commit. However, I will also try to make it work in CI, as it is a valuable tool for checking compatibility issues. I will follow your advice and try to fix it.

I am closing the issue as the problem has been solved, but I will still add information regarding the CI issue.

jonGuti13 commented 7 months ago

After adding conda clean -a to install_tf2.sh, the build fails because there is not enough space left on the device. However, I have removed some of the unnecessary packages from ubuntu-latest using a GitHub Action by @jlumbroso (https://github.com/marketplace/actions/free-disk-space-ubuntu), and now the CPU image builds successfully.
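
Since both CI failures came down to disk space, a pre-build check can make that cause visible before the long build starts. The following is a rough sketch, not part of the Vitis AI scripts: the helper names free_kib and require_free_gib are hypothetical, and the threshold in the usage comment is an arbitrary placeholder, since the space the image actually needs is not stated in this thread.

```shell
#!/bin/sh
# Sketch only: report free disk space and optionally fail early in CI.
set -eu

# Print the free space, in KiB, of the filesystem that holds the given path.
# df -P gives one line per filesystem; column 4 is the available space.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return non-zero when fewer than the requested GiB are free at the path.
require_free_gib() {
    target=$1
    needed_gib=$2
    avail_kib=$(free_kib "$target")
    needed_kib=$((needed_gib * 1024 * 1024))
    if [ "$avail_kib" -lt "$needed_kib" ]; then
        echo "Only $((avail_kib / 1024 / 1024)) GiB free on $target; need $needed_gib GiB" >&2
        return 1
    fi
}

# Show current usage so the CI log records it.
df -h /

# Example use before the build (30 is a placeholder, not a measured requirement):
#   require_free_gib / 30 && ./docker_build.sh -t cpu -f tf2
```

Running the check as a separate step before the build means an undersized runner fails in seconds with a clear message instead of partway through the image build.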