AllenInstitute / datacube

possible errors to start human mtg and/or generate human mtg data #114

Open shus2018 opened 5 years ago

shus2018 commented 5 years ago

Possible errors when starting human mtg or generating human mtg data. I experienced similar issues last week and worked around them manually; today it happened again. It would be nice to look into this to avoid potential issues in the future.

Workaround: clean the human mtg directory and re-generate the data a few times. :-)

  1. auto-deployment failed to start human mtg:

         loading '/dev/shm/human_mtg_transcriptomics.zarr.lmdb' zarr LMDBstore as xarray dataset...
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700 Traceback (most recent call last):
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700   File "server.py", line 367, in <module>
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700     **options)
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700   File "/local1/apps/datacube-builds/DataCube--237/services/datacube/datacube/datacube.py", line 258, in __init__
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700     self.load(path, chunks, missing_data, calculate_stats, persist)
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700   File "/local1/apps/datacube-builds/DataCube--237/services/datacube/datacube/datacube.py", line 323, in load
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700     self.load_zarr_lmdb(path, chunks, exclude=persist)
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700   File "/local1/apps/datacube-builds/DataCube--237/services/datacube/datacube/datacube.py", line 303, in load_zarr_lmdb
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700     self.df = self.df.chunk(chunks)
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700   File "/local1/miniconda3/envs/datacube/lib/python3.6/site-packages/xarray/core/dataset.py", line 1285, in chunk
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700     'object: %s' % bad_dims)
         2018-10-21T15:56:04-0700 [Guest 19647] 2018-10-21T15:56:04-0700 ValueError: some chunks keys are not dimensions on this object: ['gene']
         2018-10-21T15:56:05-0700 [Guest 19647] Service crashed with exit code 1. Respawning...
         2018-10-21T15:56:09-0700 [Guest 19647] 2018-10-21T15:56:09-0700 deleting '/dev/shm/human_mtg_transcriptomics.zarr.lmdb'...
         2018-10-21T15:56:09-0700 [Guest 19647] 2018-10-21T15:56:09-0700 cloning '../.././human_mtg_data/human_mtg_transcriptomics.zarr.lmdb' store to '/dev/shm/human_mtg_transcriptomics.zarr.lmdb'...
         2018-10-21T15:56:09-0700 [Guest 19647] 2018-10-21T15:56:09-0700 loading '/dev/shm/human_mtg_transcriptomics.zarr.lmdb' zarr LMDBstore as xarray dataset...
         2018-10-21T15:56:09-0700 [Guest 19647] 2018-10-21T15:56:09-0700 Traceback (most recent call last):

  2. failed to generate human data:

         ocal1/apps/datacube/scripts/datasets/human_mtg_transcriptomics.py:87: UserWarning: DataFrame columns are not unique, some columns will be omitted.
           mouse_genes = pd.DataFrame(mouse_genes).set_index('acronym').T.to_dict()
         INFO:root:writing dataset to ./human_mtg_data/human_mtg_transcriptomics.zarr.lmdb
         Traceback (most recent call last):
           File "/local1/apps/datacube/scripts/datasets/human_mtg_transcriptomics.py", line 109, in <module>
             main()
           File "/local1/apps/datacube/scripts/datasets/human_mtg_transcriptomics.py", line 93, in main
             ds.to_zarr(store=store)
           File "/local1/miniconda3/envs/datacube/lib/python3.6/site-packages/xarray/core/dataset.py", line 1187, in to_zarr
             group=group, encoding=encoding, compute=compute)
           File "/local1/miniconda3/envs/datacube/lib/python3.6/site-packages/xarray/backends/api.py", line 854, in to_zarr
             group=group, writer=None)
           File "/local1/miniconda3/envs/datacube/lib/python3.6/site-packages/xarray/backends/zarr.py", line 240, in open_group
             synchronizer=synchronizer, path=group)
           File "/local1/miniconda3/envs/datacube/lib/python3.6/site-packages/zarr/hierarchy.py", line 1128, in open_group
             err_contains_group(path)
           File "/local1/miniconda3/envs/datacube/lib/python3.6/site-packages/zarr/errors.py", line 17, in err_contains_group
             raise ValueError('path %r contains a group' % path)
         ValueError: path '' contains a group

chrisbarber commented 5 years ago

Usually "ValueError: path '' contains a group" happens when the directory isn't clean (the *.lmdb already exists). If it is happening with a clean directory, though, that would be strange.
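
A minimal sketch of the clean-up step described above, assuming it is safe to discard a stale store before regenerating the data. The helper name `remove_stale_store` is hypothetical and not part of datacube. It uses only the standard library and handles the store being either a directory or a single file, plus the `-lock` sidecar file that LMDB keeps next to a single-file environment.

```python
import os
import shutil


def remove_stale_store(path):
    """Delete a leftover store at `path`; return True if anything was removed.

    Hypothetical helper, not part of datacube.
    """
    removed = False
    # A single-file LMDB environment keeps a "<path>-lock" file beside the data file.
    for candidate in (path, path + "-lock"):
        if os.path.isdir(candidate) and not os.path.islink(candidate):
            # Directory-style store: remove the whole tree.
            shutil.rmtree(candidate)
            removed = True
        elif os.path.lexists(candidate):
            # Plain file or symlink: remove just that entry.
            os.remove(candidate)
            removed = True
    return removed
```

Calling `remove_stale_store("./human_mtg_data/human_mtg_transcriptomics.zarr.lmdb")` before running the generation script would then guarantee that the write starts from an empty path, which should rule out the "contains a group" case caused by a leftover store.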

shus2018 commented 5 years ago

yes, my workaround: clean human mtg directory and re-generate data a few times. :-)

chrisbarber commented 5 years ago

If you get "ValueError: some chunks keys are not dimensions on this object: ['gene']" again, it would help if you could save the generated files so I can look at them.
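
One way to narrow this down might be to compare the requested chunk keys with the dimensions actually present in the loaded dataset before calling `chunk`. The sketch below is only an assumption about how such a check could look: `unknown_chunk_dims` is a hypothetical helper, and the dimension name 'cell' and the chunk sizes are made up for illustration.

```python
def unknown_chunk_dims(dims, chunks):
    """Return the chunk keys that are not dimensions, in sorted order.

    `dims` is any iterable of dimension names (for an xarray Dataset,
    `ds.dims` can be passed); `chunks` is the mapping given to `chunk()`.
    Hypothetical helper, not part of datacube or xarray.
    """
    known = set(dims)
    return sorted(key for key in chunks if key not in known)


# Illustrative values only: a store that was written without a 'gene' dimension.
dims_on_disk = ["cell"]
requested_chunks = {"gene": 1000, "cell": 500}

missing = unknown_chunk_dims(dims_on_disk, requested_chunks)
if missing:
    print("chunk keys missing from dataset dimensions:", missing)
```

Logging the result right after the store is opened would show whether the generated files themselves lack the dimension, or whether the chunk settings are out of step with the data.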

shus2018 commented 5 years ago

will do, thanks. :-)