I got the following log when running a distributed TensorFlow training job. In the socket_fd_api::add_epoll_context function, there is a comment: "Currently VMA does not support more then 1 epfd listed". Is there any plan to support registering one socket fd with more than one epoll instance? Thanks!
Pid: 29071 Tid: 29218 VMA DEBUG: epfd_info:263:add_fd() epoll_ctl: fd=34 is already registered with another epoll instance 30, cannot register to epoll 32 (errno=12 Cannot allocate memory)
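For reference, here is a minimal sketch of the pattern the log seems to describe: one socket fd registered with two separate epoll instances. This is only an illustration, not the actual TensorFlow code path, and the function name `register_in_two_epolls` is made up for the example. On a stock Linux kernel both registrations succeed. Based on the log above, I would expect the second one to fail when the process runs under VMA, but I have not confirmed that this exact script reproduces it.

```python
import select
import socket


def register_in_two_epolls():
    """Register one socket fd with two epoll instances (Linux only).

    Returns True if both registrations succeed, False if the second
    registration is rejected with an OSError.
    The function name is hypothetical, chosen for this illustration.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    ep_first = select.epoll()
    ep_second = select.epoll()
    try:
        # First registration: corresponds to "epoll instance 30" in the log.
        ep_first.register(sock.fileno(), select.EPOLLIN)
        try:
            # Second registration of the same fd with a different epoll
            # instance: corresponds to "epoll 32" in the log.
            ep_second.register(sock.fileno(), select.EPOLLIN)
        except OSError as err:
            print("second registration failed:", err)
            return False
        return True
    finally:
        ep_second.close()
        ep_first.close()
        sock.close()


if __name__ == "__main__":
    print("both registrations succeeded:", register_in_two_epolls())
```

The sketch uses only the standard `select.epoll` and `socket` modules, so it can be run unmodified with and without VMA loaded to compare the behavior.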