chao1224 / MoleculeSTM

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
https://chao1224.github.io/MoleculeSTM

Docker ERROR: failed to receive status: rpc error #15

Closed cankobanz closed 5 months ago

cankobanz commented 6 months ago

Hi, thank you for your work.

I aim to use MegaMolBART for encoding and decoding in my work. I therefore tried to build the required environment using the provided Dockerfile, but I am getting this error:

 => [22/22] RUN cd /tmp/apex/ && pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./ 
 => => #     g++ -pthread -shared -B /opt/conda/compiler_compat -L/opt/conda/lib -Wl,-rpath=/opt/conda/lib -Wl,--no-as-needed -Wl,--sysroot=/ /tmp/pip-req-build-9per1b4n/build/temp.linux-x86_64-3.7/csrc/flatten_unflatten.o -L/opt/c
 => => # onda/lib/python3.7/site-packages/torch/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -o build/lib.linux-x86_64-3.7/apex_C.cpython-37m-x86_64-linux-gnu.so
 => => #     building 'amp_C' extension
 => => #     Emitting ninja build file /tmp/pip-req-build-9per1b4n/build/temp.linux-x86_64-3.7/build.ninja...
 => => #     Compiling objects...
 => => #     Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
PS D:\pythonProject\ConGen> sudo systemctl status docker   # For systems using systemd

Do you have any idea why this occurs? If not, given that I specifically need to use MegaMolBART, do you have any recommendations on how I can achieve that?

chao1224 commented 6 months ago

Hi @cankobanz,

This seems to be an issue with apex. What I currently provide is:

RUN cd /tmp && git clone https://github.com/chao1224/apex.git
RUN cd /tmp/apex/ && pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

I changed the apex repo to align with my CUDA and torch versions. If you are using different versions, you may want to check the latest upstream apex repo instead.
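As a minimal sketch of how the two points above might be checked in the Dockerfile: the first added step prints the torch and CUDA versions inside the image, so they can be compared against what the apex fork expects. The second caps the number of parallel compile workers through `MAX_JOBS`, the variable the build log itself names. Capping workers is only an assumption about the cause here: an `EOF` from the build daemon during the ninja compile step can happen when the build runs out of memory, but nothing in this thread confirms that. The value `2` is an arbitrary illustrative choice.

```dockerfile
# Print the torch and CUDA versions available in the image, to compare
# against the versions the apex fork was adjusted for.
RUN python -c "import torch; print(torch.__version__, torch.version.cuda)"

RUN cd /tmp && git clone https://github.com/chao1224/apex.git

# Same install command as above, with MAX_JOBS set to limit parallel
# compile workers (assumed value; adjust to the memory available to Docker).
RUN cd /tmp/apex/ && MAX_JOBS=2 pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```

If the build still fails with the same error, the version output from the first step should at least show whether the image matches the CUDA and torch versions the fork targets.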