Traceback (most recent call last):
  File "/gpfs/alpine/proj-shared/med110/hrlee/git/braceal/molecules/scripts/traj_to_dset.py", line 99, in <module>
    main()
  File "/gpfs/alpine/proj-shared/med110/conda/pytorch/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/gpfs/alpine/proj-shared/med110/conda/pytorch/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/gpfs/alpine/proj-shared/med110/conda/pytorch/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/gpfs/alpine/proj-shared/med110/conda/pytorch/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/gpfs/alpine/proj-shared/med110/hrlee/git/braceal/molecules/scripts/traj_to_dset.py", line 94, in main
    sel=selection, cm_format=cm_format, num_workers=num_workers, comm=mpi_comm, verbose=verbose)
  File "/gpfs/alpine/med110/proj-shared/hrlee/git/braceal/molecules/molecules/sim/dataset.py", line 547, in traj_to_dset
    rows_ = comm.gather(rows_, 0)
  File "mpi4py/MPI/Comm.pyx", line 1262, in mpi4py.MPI.Comm.gather
  File "mpi4py/MPI/msgpickle.pxi", line 680, in mpi4py.MPI.PyMPI_gather
  File "mpi4py/MPI/msgpickle.pxi", line 685, in mpi4py.MPI.PyMPI_gather
  File "mpi4py/MPI/msgpickle.pxi", line 148, in mpi4py.MPI.Pickle.allocv
  File "mpi4py/MPI/msgpickle.pxi", line 139, in mpi4py.MPI.Pickle.alloc
SystemError: Negative size passed to PyBytes_FromStringAndSize
This happened when I tried to aggregate 240 DCD files across 40 Summit nodes. The error is the traceback above.
I tried adding an exception handler around line 547 of dataset.py that sets rows and cols to 0 so that a corrupted result is ignored, but that does not seem like a correct patch. I will dig further, but wanted to report this first.
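
One direction I may try, sketched here under an unverified assumption: the negative size could come from the combined pickled payload of all ranks exceeding what a signed 32-bit byte count can hold (2**31 - 1 bytes) when the root allocates its receive buffer. If that is the cause, gathering the per-rank lists in several smaller rounds should avoid it. The helper names `pickled_size` and `gather_in_chunks`, and the `chunk_len` default, are hypothetical and not part of the molecules code base. The only calls on `comm` are the mpi4py methods `Get_rank`, `Get_size`, `allgather`, and `gather`.

```python
import pickle

# Largest byte count a signed 32-bit integer can represent.
INT32_MAX = 2**31 - 1


def pickled_size(obj):
    """Return the number of bytes obj occupies once pickled."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def gather_in_chunks(comm, items, root=0, chunk_len=1000):
    """Gather a per-rank list onto root using several small messages.

    comm is assumed to behave like an mpi4py communicator.
    Returns a list with one merged list per rank on root, None elsewhere.
    """
    n_chunks = (len(items) + chunk_len - 1) // chunk_len
    # Every rank must take part in the same number of collective calls,
    # so agree on the largest chunk count across ranks.
    n_rounds = max(comm.allgather(n_chunks))
    is_root = comm.Get_rank() == root
    merged = [[] for _ in range(comm.Get_size())] if is_root else None
    for i in range(n_rounds):
        # Ranks that have run out of items contribute an empty slice.
        chunk = items[i * chunk_len:(i + 1) * chunk_len]
        parts = comm.gather(chunk, root)
        if is_root:
            for dst, part in zip(merged, parts):
                dst.extend(part)
    return merged
```

In `traj_to_dset`, the call `rows_ = comm.gather(rows_, 0)` would then become something like `rows_ = gather_in_chunks(comm, rows_, 0)`, assuming `rows_` is a list on each rank. Comparing `pickled_size(rows_)` multiplied by the number of ranks against `INT32_MAX` before the gather would be a quick way to check whether the size assumption holds at all.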