Open seismolab-uct opened 1 day ago
As I suspected, after resetting all my stack jobs and all my cross-correlation jobs and changing the setting to "keep all = Y", my run of
msnoise -t 8 cc compute_cc
exited with an error:
2024-10-30 23:16:15.591004 msnoise [pid 393605][INFO]: Finished preprocessing
2024-10-30 23:16:22.451574 msnoise [pid 330386][INFO]: Received preprocessed traces
Process Process-2:
Traceback (most recent call last):
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/msnoise/s03compute_no_rotation.py", line 620, in main
    export_allcorr2(db, ccfid, allcorr[ccfid])
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/msnoise/api.py", line 1151, in export_allcorr2
    df.to_hdf(os.path.join(path, date+'.h5'), key='data')
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/util/_decorators.py", line 333, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/core/generic.py", line 2855, in to_hdf
    pytables.to_hdf(
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/io/pytables.py", line 311, in to_hdf
    f(store)
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/io/pytables.py", line 293, in <lambda>
    f = lambda store: store.put(
                      ^^^^^^^^^^
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/io/pytables.py", line 1160, in put
    self._write_to_group(
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/io/pytables.py", line 1858, in _write_to_group
    s.write(
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/io/pytables.py", line 3333, in write
    self.write_array(f"block{i}_values", blk.values, items=blk_items)
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/pandas/io/pytables.py", line 3198, in write_array
    self._handle.create_array(self.group, key, value)
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/tables/file.py", line 1142, in create_array
    return Array(parentnode, name,
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/tables/array.py", line 186, in __init__
    super().__init__(parentnode, name, new, Filters(), byteorder, _log,
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/tables/leaf.py", line 350, in __init__
    super().__init__(parentnode, name, _log)
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/tables/node.py", line 256, in __init__
    self._v_objectid = self._g_create()
                       ^^^^^^^^^^^^^^^^
  File "/home/seismolab/Software/anaconda3/envs/msnoise/lib/python3.12/site-packages/tables/array.py", line 218, in _g_create
    (self._v_objectid, self.shape, self.atom) = self._create_array(
                                                ^^^^^^^^^^^^^^^^^^^
  File "tables/hdf5extension.pyx", line 1416, in tables.hdf5extension.Array._create_array
tables.exceptions.HDF5ExtError: Problems creating the Array.
I ran out of disk space: the CROSS_CORRELATIONS directory (named Xcorrs in my case) is over 740 GB, and there are still a good number of cross-correlation jobs to compute (541669 CC jobs in the database: 23841 todo, 126443 in progress and 391385 done).
Is it possible to keep the data folder on another disk, or does the database expect it to be under the same project folder where the db.ini file is?
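For a rough idea of how much space the full run might need, the job counts above can be extrapolated. This is only a back-of-the-envelope sketch in Python: it assumes the output size grows linearly with the number of "done" jobs and ignores whatever the "in progress" jobs have already written, so treat the result as an order of magnitude. The helper names (estimate_total_gb, free_gb) are made up for this illustration and are not part of MSNoise.

```python
import shutil


def estimate_total_gb(used_gb: float, jobs_done: int, jobs_total: int) -> float:
    """Linearly extrapolate the final output size from the jobs finished so far.

    Assumption: every job writes roughly the same amount of data.
    """
    if jobs_done <= 0:
        raise ValueError("need at least one finished job to extrapolate")
    return used_gb * jobs_total / jobs_done


def free_gb(path: str) -> float:
    """Free space (in GB, 1 GB = 1e9 bytes) on the disk that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


# Numbers quoted in this thread: 740 GB written so far,
# 391385 jobs done out of 541669 (23841 todo + 126443 in progress + 391385 done).
jobs_total = 23841 + 126443 + 391385
projected = estimate_total_gb(740, 391385, jobs_total)
remaining = projected - 740

print(f"projected total: {projected:.0f} GB, still to write: {remaining:.0f} GB")
print(f"free on current disk: {free_gb('.'):.0f} GB")
```

Under these assumptions the directory would end up at roughly 1 TB, so the target disk needs a few hundred GB of headroom beyond what is already written.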
Hi, if you don't plan to use sub-daily stacks, you could use what @LaureBrenot prepared here: https://github.com/ROBelgium/MSNoise/pull/363
That is: it does not expect keep_all=Y, and builds the REF and MOV stacks from the DAY stacks.
Re: moving things: it's possible to move the whole project at once to another disk (or, since you're on Linux, move the CROSS_CORRELATIONS directory elsewhere and make a symbolic link to it).
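The move-and-symlink option could look something like the following. It is a minimal sketch using only the Python standard library, not an MSNoise feature: the function name relocate_with_symlink and the example paths are hypothetical, and it is probably safest to run it only while no msnoise job is writing to the directory.

```python
import os
import shutil
from pathlib import Path


def relocate_with_symlink(src, dest_parent):
    """Move directory `src` into `dest_parent` and leave a symlink at the old path.

    Returns the new location of the directory.
    """
    # abspath (not resolve) so that an existing symlink is not silently followed
    src_path = Path(os.path.abspath(src))
    parent_path = Path(os.path.abspath(dest_parent))
    dest_path = parent_path / src_path.name

    if src_path.is_symlink() or not src_path.is_dir():
        raise ValueError(f"{src_path} is not a real directory")
    if not parent_path.is_dir():
        raise NotADirectoryError(f"{parent_path} does not exist")
    if dest_path.exists():
        raise FileExistsError(f"{dest_path} already exists")

    # shutil.move falls back to copy + delete when the target is on another disk
    shutil.move(str(src_path), str(dest_path))
    # the old path now points at the new location, so existing settings keep working
    os.symlink(dest_path, src_path, target_is_directory=True)
    return dest_path
```

For example, relocate_with_symlink("/path/to/project/CROSS_CORRELATIONS", "/mnt/bigdisk") would leave /path/to/project/CROSS_CORRELATIONS as a link to /mnt/bigdisk/CROSS_CORRELATIONS (both paths are placeholders; use your own folder name, e.g. Xcorrs).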
Ok thanks Thomas, I'll try moving it first and creating a symbolic link.
Hi, I am looking into generating a reference stack with
msnoise -t 8 cc stack -r
and the error I get is the one reproduced in issue #339, which is caused by the setting "keep all = N". I was not aware that "keep all = Y" is compulsory in the dev version; my original thinking was that I would save some disk space by using "keep all = N". I think my only option at the moment is to start over with the cross-correlations by resetting all stack jobs and all cross-correlation jobs and setting "keep all = Y". Could someone confirm whether that is indeed my only option? My dataset is relatively large (3 TB) and my STACKS folder is already 23 GB. I am guessing the CROSS_CORRELATIONS directory will be much larger than 23 GB, which makes me worry about disk space (3.7 TB).
I would appreciate some guidance, thanks.