PaddlePaddle / X2Paddle

Deep learning model converter for PaddlePaddle. (PaddlePaddle deep learning model conversion tool)
http://www.paddlepaddle.org/
Apache License 2.0

[PaddleV3] Update Dockerfile to Paddle 3.0.0beta and add a blacklist for CI tests #1061

Closed megemini closed 1 month ago

megemini commented 1 month ago

Create A Good Pull Request

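  1. Update the Dockerfile to Paddle 3.0.0beta, and bump PyTorch, ONNX, etc. to their latest versions as well
  2. test_benchmark uses black.list to filter tests
    • By default, no model is tested
    • Later, each time a model is fixed, its entry is removed from black.list so that CI tests it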
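
The filtering described above can be sketched as follows. This is a minimal illustration, not the script in this PR: the file format (one model name per line, with `#` comments) and the function names `load_blacklist` and `select_models` are assumptions made for the example.

```python
from pathlib import Path


def load_blacklist(path):
    """Read model names to skip, one per line (assumed format).

    Blank lines and anything after a '#' are ignored.
    """
    names = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.add(entry)
    return names


def select_models(all_models, blacklist):
    """Return the models CI should test, preserving the original order."""
    return [model for model in all_models if model not in blacklist]
```

With every model listed in the file, `select_models` returns an empty list, which matches the "nothing is tested by default" starting point; deleting a line re-enables that model.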

The Dockerfile builds fine for me locally:


λ 483e70b23ef6 /home python
Python 3.9.18 (main, Aug 25 2023, 13:20:04) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
>>> import torch
>>> import tensorflow
2024-10-09 06:23:12.329805: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-09 06:23:12.360176: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-09 06:23:12.815496: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (2.0.7) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
>>> import onnx
>>> paddle.__version__
'3.0.0-beta1'
>>> torch.__version__
'2.4.1+cu121'
>>> tensorflow.__version__
'2.16.1'
>>> onnx.__version__
'1.17.0'
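
The interactive check above could also be scripted so it can be rerun after each image build. A minimal sketch, assuming each framework exposes a `__version__` attribute as the session shows; the helper name `framework_versions` is hypothetical.

```python
import importlib


def framework_versions(modules=("paddle", "torch", "tensorflow", "onnx")):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is not installed in this environment.
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in framework_versions().items():
        print(f"{name}: {version or 'not installed'}")
```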

One small issue: the Python in the base Docker image is 3.9, not 3.8, the minimum version Paddle supports. It is not a big deal, so I kept it as is.

However, Caffe is not configured in the Dockerfile. How is it set up?

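In CI, I see that PyTorch is tested separately from the other frameworks. I am not sure whether my script has any problems, so I will submit it first and see.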

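Also, during later changes there is no guarantee that the latest versions of all the frameworks above will pass. If adapting to them proves too difficult, the Dockerfile may need to be changed again.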

Related: https://github.com/PaddlePaddle/X2Paddle/issues/1060

@luotao1 please review.

Please keep the text below at the end of the PR description and, after submitting the PR, check the items below as applicable.

Please check the following steps before merging this pull request

If this PR adds new model support, please update model_zoo.md and add the model to our test model zoos (@luotao1)

megemini commented 1 month ago

Update 20241011

megemini commented 1 month ago

Update 20241016

The CUDA version on the CI server should be 11.2; see the earlier log:

2024-10-13 20:08:04 LD_LIBRARY_PATH=/usr/local/cuda-11.2/targets/x86_64-linux/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64

Could you please try building the image again? 😅😅😅

@luotao1

luotao1 commented 1 month ago

You can roll back to the previous commit and just export the path: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.8/compat

I won't regenerate the image for now; I'll add this line to the CI configuration.
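
For a test launched from Python rather than from a shell, the same effect as the export above can be had by building the child process environment explicitly. This is a sketch only: how the CI configuration actually applies the export is not shown in this thread, and the helper name `append_library_path` is hypothetical.

```python
def append_library_path(env, directory, var="LD_LIBRARY_PATH"):
    """Return a copy of env with directory appended to a ':'-separated path variable.

    The directory is not added twice, and the input mapping is left unmodified.
    """
    new_env = dict(env)
    parts = [p for p in new_env.get(var, "").split(":") if p]
    if directory not in parts:
        parts.append(directory)
    new_env[var] = ":".join(parts)
    return new_env
```

The result would typically be passed as `env=` to `subprocess.run`, for example `append_library_path(os.environ, "/usr/local/cuda-11.8/compat")`. Changing the variable inside an already running process does not affect libraries the dynamic linker resolves for that process, which is why it is set for the child instead.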