scipion-em / scipion-em-continuousflex

Plugin for continuous conformational flexibility analysis containing HEMNMA, StructMap, HEMNMA-3D, TomoFlow, NMMD, and DeepHEMNMA for in vitro and in situ cryo-EM/ET.
GNU General Public License v3.0

TomoFlow Issues #170

Closed DcShepherd closed 1 year ago

DcShepherd commented 1 year ago

Hello, I am having some issues installing and using the continuousflex plugin, specifically TomoFlow.

My first issue occurs during installation with scipion3 installp -p scipion-em-continuousflex --devel after cloning the repository. The installation reports an error saying that pycuda and farneback3d cannot be installed. I can get around this by installing pycuda with conda/mamba and farneback3d with pip3 install farneback3d.

However, when I test TomoFlow by running scipion tests continuousflex.tests.test_workflow_TomoFlow.TestTomoFlow, I get the following error:

* last 50 lines of STD ERR ***

/home/doulin/clones/scipion-em-continuousflex/continuousflex/protocols/utilities/pdb_handler.py:64: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  atomNum[np.where(atomNum == "****")[0]] = "-1"
Traceback (most recent call last):
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 201, in run
    self._run()
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 252, in _run
    resultFiles = self._runFunc()
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 248, in _runFunc
    return self._func(self._args)
  File "/home/doulin/clones/scipion-em-continuousflex/continuousflex/protocols/protocol_pdb_dimred.py", line 152, in readInputFiles
    numpyArr2dcd(pdbs_arr, self._getExtraPath("coords.dcd"))
  File "/home/doulin/clones/scipion-em-continuousflex/continuousflex/protocols/utilities/genesisutilities.py", line 189, in numpyArr2dcd
    nframe, natom, = arr.shape
ValueError: not enough values to unpack (expected 3, got 1)
Protocol failed: not enough values to unpack (expected 3, got 1)

* end of STD ERR ***

[ FAILED ] TestTomoFlow.test_all

Traceback (most recent call last):
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
    yield
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/unittest/case.py", line 676, in run
    self._callTestMethod(testMethod)
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
    method()
  File "/home/doulin/clones/scipion-em-continuousflex/continuousflex/tests/test_workflow_TomoFlow.py", line 78, in test_all
    self.launchProtocol(protpdbdimred)
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/tests/tests.py", line 116, in launchProtocol
    raise Exception("Protocol %s execution failed. See last log lines above for more details." % prot.getRunName())
Exception: Protocol PCA on groundtruth PDBs execution failed. See last log lines above for more details.

Any help solving this issue would be greatly appreciated.

MohamadHarastani commented 1 year ago

Hello @DcShepherd, thanks for reporting. I know that pycuda is problematic to install on some systems. I am working towards using a different library for optical flow calculations, but it will take me some time. In any case, I couldn't reproduce these installation and testing issues. The cause might also be the Xmipp installation, since we depend on Xmipp during the NMA analysis that produces the simulated movements in the test.

During the installation, farneback-3d fails to install at the beginning, but after another auto-attempt (after downgrading setuptools) it installs.

Could you try the following? After cloning, cd into the cloned directory, make sure you are on devel branch, reinstall the plugin and relaunch the test:

git clone https://github.com/scipion-em/scipion-em-continuousflex.git
cd scipion-em-continuousflex
git checkout devel && git pull
scipion3 uninstallp -p scipion-em-continuousflex
scipion3 installp -p . --devel
scipion3 tests --grep tomoflow --run

If the problem persists, try on another computer if possible, or copy everything you have on the terminal so that I can try to debug it.

Regards Mohamad

DcShepherd commented 1 year ago

Hello @MohamadHarastani, thanks for your help. I tried to install continuousflex using the steps you mentioned, but I got the same error.

The error log is >1000 lines so I will paste part of it:

Start of error log from running scipion3 installp -p . --devel

bpl-subset/bpl_subset/boost/function/function_base.hpp: In instantiation of ‘static void pycudaboost::detail::function::functor_manager_common::manage_small(const pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::functor_manager_operation_type) [with Functor = pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > >]’: bpl-subset/bpl_subset/boost/function/function_base.hpp:364:56: required from ‘static void pycudaboost::detail::function::functor_manager::manager(const pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::functor_manager_operationtype, mpl::true_) [with Functor = pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > >; mpl::true = mpl::bool]’ bpl-subset/bpl_subset/boost/function/function_base.hpp:412:18: required from ‘static void pycudaboost::detail::function::functor_manager::manager(const pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::functor_manager_operation_type, pycudaboost::detail::function::function_obj_tag) [with Functor = pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > >]’ bpl-subset/bpl_subset/boost/function/function_base.hpp:440:20: required from ‘static void pycudaboost::detail::function::functor_manager::manage(const 
pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::function_buffer&, pycudaboost::detail::function::functor_manager_operation_type) [with Functor = pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > >]’ bpl-subset/bpl_subset/boost/function/function_template.hpp:934:13: required from ‘void pycudaboost::function2<R, T1, T2>::assign_to(Functor) [with Functor = pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > >; R = bool; T0 = const pycudaboost::python::detail::exception_handler&; T1 = const pycudaboost::function0&]’ bpl-subset/bpl_subset/boost/function/function_template.hpp:722:22: required from ‘pycudaboost::function2<R, T1, T2>::function2(Functor, typename pycudaboost::enable_if_c<pycudaboost::type_traits::ice_not<pycudaboost::is_integral::value>::value, int>::type) [with Functor = pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > >; R = bool; T0 = const pycudaboost::python::detail::exception_handler&; T1 = const pycudaboost::function0&; typename pycudaboost::enable_if_c<pycudaboost::type_traits::ice_not<pycudaboost::is_integral::value>::value, int>::type = int]’ bpl-subset/bpl_subset/boost/python/exception_translator.hpp:20:39: required from ‘void pycudaboost::python::register_exception_translator(Translate, pycudaboost::type) [with ExceptionType = pycuda::error; Translate = void ()(const pycuda::error&)]’ src/wrapper/wrap_cudadrv.cpp:691:74: 
required from here bpl-subset/bpl_subset/boost/function/function_base.hpp:318:18: warning: placement new constructing an object of type ‘pycudaboost::detail::function::functor_manager_common<pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > > >::functor_type’ {aka ‘pycudaboost::_bi::bind_t<bool, pycudaboost::python::detail::translate_exception<pycuda::error, void ()(const pycuda::error&)>, pycudaboost::_bi::list3<pycudaboost::arg<1>, pycudaboost::arg<2>, pycudaboost::_bi::value<void ()(const pycuda::error&)> > >’} and size ‘16’ in a region of type ‘char’ and size ‘1’ [-Wplacement-new=] 318 | new (reinterpret_cast<void>(&out_buffer.data)) functor_type(in_functor); | ^~~~~~~~~ error: command 'gcc' failed with exit status 1 [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> pycuda

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
| failed

CondaEnvException: Pip failed

END of error log from running scipion3 installp -p . --devel

There was a similar-looking error on the CryoSPARC forums. To see what would happen, I added the cuda-nvcc=11.7/cuda-toolkit=11.7 dependencies, with the appropriate channel, to the conda.yaml. This worked, and pycuda installed without much fuss.

However, the terminal log said "Failed to build farneback3d" but also said that it ran the setup and installed it; see below:
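For anyone checking which CUDA compiler the environment actually picks up after such a change, here is a minimal sketch. It assumes only that nvcc is on the PATH and that its --version output contains a "release X.Y" fragment. The function names are hypothetical and not part of Scipion or continuousflex.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(version_text: str) -> Optional[str]:
    """Return the CUDA release (e.g. '11.7') found in `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", version_text)
    return match.group(1) if match else None


def installed_nvcc_release() -> Optional[str]:
    """Return the release of the nvcc found on PATH, or None if there is none."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    # Prints None when no nvcc is visible in the active environment.
    print("nvcc release:", installed_nvcc_release())
```

Run it with the Python interpreter of the activated Scipion conda environment, so that the PATH matches the one the installer sees.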

First error after editing the conda.yaml file

Failed to build farneback3d
Installing collected packages: tensorboard-plugin-wit, pytz, pyasn1, emtable, appdirs, zipp, urllib3, typing-extensions, tqdm, threadpoolctl, tensorboard-data-server, six, setuptools, rsa, pyparsing, pyasn1-modules, psutil, protobuf, platformdirs, pillow, packaging, oauthlib, numpy, mpi4py, MarkupSafe, llvmlite, kiwisolver, joblib, idna, grpcio, future, fonttools, decorator, cycler, configparser, charset-normalizer, certifi, cachetools, absl-py, werkzeug, torch, tkcolorpicker, tifffile, scipy, requests, pytools, python-dateutil, mrcfile, mako, importlib-metadata, google-auth, biopython, bibtexparser, torchvision, scikit-learn, requests-oauthlib, pycuda, pandas, numba, matplotlib, markdown, starfile, scipion-pyworkflow, pynndescent, google-auth-oauthlib, farneback3d, umap-learn, tensorboard, scipion-em
Attempting uninstall: setuptools
Found existing installation: setuptools 67.4.0
Uninstalling setuptools-67.4.0:
Successfully uninstalled setuptools-67.4.0
Running setup.py install for farneback3d: started
Running setup.py install for farneback3d: finished with status 'done'

End of first error after editing the conda.yaml file
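Because the log is ambiguous ("Failed to build" followed by a setup.py install that finished with status 'done'), one way to settle it is to ask Python whether the modules can be found. This is a minimal sketch. It assumes the import names are pycuda and farneback3d, and the helper missing_modules is hypothetical.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the two packages mentioned in the log.
    missing = missing_modules(["pycuda", "farneback3d"])
    print("missing:", missing if missing else "none")
```

The check only tells you that a module is discoverable, not that its compiled parts load correctly. It has to be run with the interpreter of the Scipion environment to be meaningful.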

During the installation of MD-NMMD-Genesis I get the following errors:

Start of second error after editing the conda.yaml file

Making install in src make[1]: Entering directory '/home/doulin/software/em/MD-NMMD-Genesis-1.1/src' Making install in lib make[2]: Entering directory '/home/doulin/software/em/MD-NMMD-Genesis-1.1/src/lib' cpp -traditional-cpp -traditional -DHAVE_CONFIG_H constants.fpp constants.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c constants.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H mpi_parallel.fpp mpi_parallel.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c mpi_parallel.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H messages.fpp messages.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c messages.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H random.fpp random.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c random.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H atom_libs.fpp atom_libs.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c atom_libs.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H math_libs.fpp math_libs.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c math_libs.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H string.fpp string.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c string.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H table_libs.fpp table_libs.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c table_libs.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H timers.fpp timers.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c timers.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H nbond_list.fpp nbond_list.f90 mpif90 -I. -I../../src -I. 
-O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c nbond_list.f90 cpp -traditional-cpp -traditional -DHAVE_CONFIG_H ffte_fft235.fpp ffte_fft235.f90 mpif90 -I. -I../../src -I. -O3 -ffast-math -march=native -ffree-line-length-none -fopenmp -c ffte_fft235.f90 ffte_fft235.fpp:195:30:

195 | CALL SETTBL0(W(J),8,L)
    | 1
Error: Type mismatch in argument ‘w’ at (1); passed COMPLEX(8) to REAL(8)
ffte_fft235.fpp:200:30:

200 | CALL SETTBL0(W(J),5,L)
    | 1
Error: Type mismatch in argument ‘w’ at (1); passed COMPLEX(8) to REAL(8)
ffte_fft235.fpp:205:30:

205 | CALL SETTBL0(W(J),4,L)
    | 1
Error: Type mismatch in argument ‘w’ at (1); passed COMPLEX(8) to REAL(8)
ffte_fft235.fpp:210:30:

210 | CALL SETTBL0(W(J),3,L)
    | 1
Error: Type mismatch in argument ‘w’ at (1); passed COMPLEX(8) to REAL(8)
make[2]: [Makefile:714: ffte_fft235.o] Error 1
make[2]: Leaving directory '/home/doulin/software/em/MD-NMMD-Genesis-1.1/src/lib'
make[1]: [Makefile:359: install-recursive] Error 1
make[1]: Leaving directory '/home/doulin/software/em/MD-NMMD-Genesis-1.1/src'
make: *** [Makefile:355: install-recursive] Error 1
target '/home/doulin/software/em/MD-NMMD-Genesis-1.1/bin/atdyn' not built (after running 'git clone -b merge_genesis_1.4 https://github.com/continuousflex-org/MD-NMMD-Genesis.git . ; autoreconf -fi ; ./configure LDFLAGS=-L"/home/doulin/software/em/ContinuousFlex-3.3.14/lib" FFLAGS="-fallow-argument-mismatch -ffree-line-length-none"; make install;')

End of second error after editing the conda.yaml file

I decided to run the test anyway with scipion3 tests --grep tomoflow --run, but I get the error shown below. I get the same error if I manually install pycuda and farneback3d.

Logging configured. STDOUT --> /home/doulin/ScipionUserData/projects/TestTomoFlow/Runs/000207_FlexProtDimredPdb/logs/run.stdout , STDERR --> /home/doulin/ScipionUserData/projects/TestTomoFlow/Runs/000207_FlexProtDimredPdb/logs/run.stderr
RUNNING PROTOCOL -----------------
Protocol starts
Hostname: doulin-ghosal-lab
PID: 387364
pyworkflow: 3.0.29
plugin: continuousflex
plugin v: 3.3.14
currentDir: /home/doulin/ScipionUserData/projects/TestTomoFlow
workingDir: Runs/000207_FlexProtDimredPdb
runMode: Continue
MPI: 1
threads: 1
Starting at step: 1
Running steps
STARTED: readInputFiles, step 1, time 2023-02-27 11:27:14.859553

Reading pdb file Runs/000088_FlexProtSynthesizeSubtomo/extra/00001_df.pdb ...
Warning : Can not read PDB file Runs/000088_FlexProtSynthesizeSubtomo/extra/00001_df.pdb
Reading pdb file Runs/000088_FlexProtSynthesizeSubtomo/extra/00002_df.pdb ...
Warning : Can not read PDB file Runs/000088_FlexProtSynthesizeSubtomo/extra/00002_df.pdb
Reading pdb file Runs/000088_FlexProtSynthesizeSubtomo/extra/00003_df.pdb ...
Warning : Can not read PDB file Runs/000088_FlexProtSynthesizeSubtomo/extra/00003_df.pdb
Reading pdb file Runs/000088_FlexProtSynthesizeSubtomo/extra/00004_df.pdb ...
Warning : Can not read PDB file Runs/000088_FlexProtSynthesizeSubtomo/extra/00004_df.pdb
Reading pdb file Runs/000088_FlexProtSynthesizeSubtomo/extra/00005_df.pdb ...
Warning : Can not read PDB file Runs/000088_FlexProtSynthesizeSubtomo/extra/00005_df.pdb
Reading pdb file Runs/000088_FlexProtSynthesizeSubtomo/extra/00006_df.pdb ...
Warning : Can not read PDB file Runs/000088_FlexProtSynthesizeSubtomo/extra/00006_df.pdb
Wrinting dcd file Runs/000207_FlexProtDimredPdb/extra/coords.dcd
FAILED: readInputFiles, step 1, time 2023-02-27 11:27:14.872191
*** Last status is failed
------------------- PROTOCOL FAILED (DONE 1/3)

* end of STD OUT ***

* last 50 lines of STD ERR ***

/home/doulin/clones/scipion-em-continuousflex/continuousflex/protocols/utilities/pdb_handler.py:64: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  atomNum[np.where(atomNum == "****")[0]] = "-1"
Traceback (most recent call last):
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 201, in run
    self._run()
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 252, in _run
    resultFiles = self._runFunc()
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/protocol/protocol.py", line 248, in _runFunc
    return self._func(self._args)
  File "/home/doulin/clones/scipion-em-continuousflex/continuousflex/protocols/protocol_pdb_dimred.py", line 152, in readInputFiles
    numpyArr2dcd(pdbs_arr, self._getExtraPath("coords.dcd"))
  File "/home/doulin/clones/scipion-em-continuousflex/continuousflex/protocols/utilities/genesisutilities.py", line 189, in numpyArr2dcd
    nframe, natom, = arr.shape
ValueError: not enough values to unpack (expected 3, got 1)
Protocol failed: not enough values to unpack (expected 3, got 1)

* end of STD ERR ***

[ FAILED ] TestTomoFlow.test_all

Traceback (most recent call last):
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/unittest/case.py", line 60, in testPartExecutor
    yield
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/unittest/case.py", line 676, in run
    self._callTestMethod(testMethod)
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/unittest/case.py", line 633, in _callTestMethod
    method()
  File "/home/doulin/clones/scipion-em-continuousflex/continuousflex/tests/test_workflow_TomoFlow.py", line 78, in test_all
    self.launchProtocol(protpdbdimred)
  File "/home/doulin/miniconda3/envs/scipion3/lib/python3.8/site-packages/pyworkflow/tests/tests.py", line 116, in launchProtocol
    raise Exception("Protocol %s execution failed. See last log lines above for more details." % prot.getRunName())
Exception: Protocol PCA on groundtruth PDBs execution failed. See last log lines above for more details.

[==========] run 1 tests (31.568 secs)

[ FAILED ] 1 tests

[ PASSED ] 0 tests
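Reading the two logs together, a plausible explanation (an assumption, not something confirmed by the plugin code) is that none of the six PDB files could be read, so the array handed to numpyArr2dcd was not three-dimensional and its shape could not be unpacked into three values. The sketch below reproduces that error message with plain tuples standing in for array shapes. unpack_shape and the sizes used are hypothetical.

```python
def unpack_shape(shape):
    """Unpack a (frames, atoms, coords) shape, failing with a clearer message."""
    if len(shape) != 3:
        raise ValueError(
            f"expected a 3-D coordinate array, got shape {shape}; "
            "were any PDB files actually read?"
        )
    nframe, natom, ncoord = shape
    return nframe, natom, ncoord


# Shape of an array built from PDB files that were all read successfully
# (hypothetical sizes: 6 files, 100 atoms, xyz coordinates).
good_shape = (6, 100, 3)

# Shape of an empty 1-D array, as could result when every read fails.
empty_shape = (0,)

try:
    # Plain three-way unpacking of a one-element shape, as in the traceback.
    nframe, natom, ncoord = empty_shape
except ValueError as err:
    plain_message = str(err)
    print(plain_message)
```

A guard like unpack_shape would point at the unreadable input files directly instead of surfacing as an unpacking error further downstream.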

MohamadHarastani commented 1 year ago

Thanks for the details. Since you managed to install it, let's first try to fix the test problem. I ran into the same problem before, due to something wrongly introduced in Xmipp (see this issue). Quick fixes to continue testing TomoFlow are:
1- Easy: remove the flag --centerPDB from this line.
2- Harder, but general: compile the Xmipp devel branch.

Since I fear that we are in different time zones: if the TomoFlow code fails, I suggest matching the CUDA and GCC versions to the ones recommended by Scipion (see documentation here), namely CUDA 11.4 and gcc-10. The installation of Genesis is tricky, but since you will not use NMMD or MDSPACE, you can deactivate it by setting the default to False in this line.
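As a quick way to compare the local compiler with the gcc-10 recommendation quoted above, the following sketch reads the major version from gcc -dumpversion. The constant and function names are hypothetical, and the recommended value is simply the one quoted in this comment.

```python
import re
import shutil
import subprocess
from typing import Optional

RECOMMENDED_GCC_MAJOR = 10  # value quoted in this thread, not queried from Scipion


def gcc_major(dumpversion_output: str) -> Optional[int]:
    """Extract the major version from `gcc -dumpversion` output such as '10' or '10.3.0'."""
    match = re.match(r"\s*(\d+)", dumpversion_output)
    return int(match.group(1)) if match else None


def installed_gcc_major() -> Optional[int]:
    """Return the major version of the gcc found on PATH, or None if there is none."""
    gcc = shutil.which("gcc")
    if gcc is None:
        return None
    result = subprocess.run([gcc, "-dumpversion"], capture_output=True, text=True)
    return gcc_major(result.stdout)


if __name__ == "__main__":
    found = installed_gcc_major()
    print("gcc major:", found, "| matches recommendation:", found == RECOMMENDED_GCC_MAJOR)
```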

I will do my best to guide you through the usage of TomoFlow, so please feel free to continue posing your questions here.

Regards Mohamad

MohamadHarastani commented 1 year ago

@mms29 matching mpi, gcc and gfortran versions seems quite tricky. Could you add to the documentation something about solving this potential problem?

DcShepherd commented 1 year ago

Removing the --centerPDB flag worked, so I will mark this issue as solved. I will run TomoFlow with my data later this week to see what results I can get.