kubeedge / sedna

AI toolkit over KubeEdge
https://sedna.readthedocs.io
Apache License 2.0

Example 4: Using Federated Learning Job in YOLOv5-based Object Detection has a bug #427

Open dsj-kaiyue opened 8 months ago

dsj-kaiyue commented 8 months ago

What happened: When I tried to run the fourth Sedna example, I created a federated learning job, but my containers would not start. [image]

The container logs from `kubectl logs` are shown here. [image] [image] The key message is: AttributeError: 'NoneType' object has no attribute 'replace'. I think the original container image itself is broken.
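For readers unfamiliar with this error, here is a minimal sketch of how it typically arises in Python: a lookup returns `None` because a value was never set, and the code then calls a string method on it. The setting name `HYPOTHETICAL_MODEL_URL` and both helper functions are assumptions made up for illustration; the issue does not say which value is `None` inside the Sedna image.

```python
def reproduce(settings):
    """Trigger the same AttributeError by calling .replace on a missing value."""
    # dict.get returns None when the key is absent (hypothetical key name).
    value = settings.get("HYPOTHETICAL_MODEL_URL")
    try:
        return value.replace("\\", "/")
    except AttributeError as exc:
        return str(exc)


def safe_normalize(settings, key):
    """Fail early with a descriptive message instead of an AttributeError."""
    value = settings.get(key)
    if value is None:
        raise ValueError(f"required setting {key!r} is not set")
    return value.replace("\\", "/")


print(reproduce({}))
print(safe_normalize({"HYPOTHETICAL_MODEL_URL": "a\\b"}, "HYPOTHETICAL_MODEL_URL"))
```

A guard like `safe_normalize` does not fix the underlying cause, but it names the missing value, which makes the pod log easier to act on.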

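What you expected to happen: I expected this example to run end to end, but I have not been able to solve this problem. I ran into problems with both the 4.0 and 4.3 Docker images.

How to reproduce it (as minimally and precisely as possible): Install Sedna version 6.0 in the usual way and use the 5.0 Docker image; that reproduces the problem.

Anything else we need to know?: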

Environment:

Sedna Version
```console
$ kubectl get -n sedna deploy gm -o jsonpath='{.spec.template.spec.containers[0].image}'
kubeedge/sedna-gm:v0.6.0
$ kubectl get -n sedna ds lc -o jsonpath='{.spec.template.spec.containers[0].image}'
kubeedge/sedna-lc:v0.6.0
```
Kubernetes Version
```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a", GitTreeState:"clean", BuildDate:"2021-06-16T13:00:45Z", GoVersion:"go1.15.13", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:07Z", GoVersion:"go1.15.13", Compiler:"gc", Platform:"linux/amd64"}
```
KubeEdge Version
```console
$ cloudcore --version
KubeEdge v1.10.0
$ edgecore --version
KubeEdge v1.10.0
```

CloudSide Environment:

Hardware configuration
```console
$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          1
Vendor ID:             AuthenticAMD
CPU family:            25
Model:                 80
Model name:            AMD Ryzen 7 5800H with Radeon Graphics
Stepping:              0
CPU MHz:               3193.919
BogoMIPS:              6387.83
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              512K
L3 cache:              16384K
NUMA node0 CPU(s):     0,1
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext pdpe1gb rdtscp lm constant_tsc art rep_good nopl tsc_reliable nonstop_tsc extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cr8_legacy abm sse4a misalignsse osvw topoext retpoline_amd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero arat umip vaes vpclmulqdq overflow_recov succor
```
OS
```console
$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
```
Kernel
```console
$ uname -a
# paste output here
```
Others

EdgeSide Environment:

Hardware configuration
```console
$ lscpu
# paste output here
```
OS
```console
$ cat /etc/os-release
# paste output here
```
Kernel
```console
$ uname -a
# paste output here
```
Others
letweare commented 5 months ago

Same here, I am hitting this problem too. Did you manage to solve it in the end?

dsj-kaiyue commented 5 months ago

> Same here, I am hitting this problem too. Did you manage to solve it in the end?

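Hello, this problem occurs because the Docker image provided by Sedna is faulty. You have to fix the errors yourself, one by one, based on the pod logs. Most of them are misspelled method names.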
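When working through the pod logs one error at a time, it can help to pull out the innermost traceback frame, since that is usually the line to inspect first. The following is a minimal sketch, assuming the logs contain standard Python tracebacks; the sample log, its file paths, and the function name `build_url` are made up for illustration and are not taken from this issue.

```python
import re

# Matches the standard CPython traceback frame line:
#   File "<path>", line <n>, in <function>
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def last_frame(log_text):
    """Return (path, line, func) of the innermost traceback frame, or None."""
    matches = list(FRAME_RE.finditer(log_text))
    if not matches:
        return None
    m = matches[-1]
    return m.group("path"), int(m.group("line")), m.group("func")


# Made-up log excerpt, for illustration only.
SAMPLE_LOG = '''Traceback (most recent call last):
  File "/hypothetical/train.py", line 10, in <module>
    main()
  File "/hypothetical/lib/util.py", line 42, in build_url
    return base.replace("a", "b")
AttributeError: 'NoneType' object has no attribute 'replace'
'''

print(last_frame(SAMPLE_LOG))
```

The log text itself would come from saving the output of `kubectl logs` for the failing pod to a file and reading it in.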

letweare commented 5 months ago

Could you share your contact information? I would like to ask you a few questions.