rapidsai / gpu-bdb

RAPIDS GPU-BDB
Apache License 2.0

Handle result directory creation #221

Open randerzander opened 3 years ago

randerzander commented 3 years ago

Periodically we need to run GPU-BDB on a new system that has only node-local storage (no shared filesystem).

Getting the result directories set up correctly can be difficult. For instance, with a config containing output_dir: /data/gpu-bdb/results/, and with that directory present on all worker nodes, q10 fails:

Encountered Exception while running query
Traceback (most recent call last):
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 283, in run_dask_cudf_query
    benchmark(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 115, in write_result
    write_etl_result(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/bdb_tools/utils.py", line 147, in write_etl_result
    df.to_parquet(output_path, write_index=False)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask_cudf/core.py", line 263, in to_parquet
    return to_parquet(self, path, *args, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/dataframe/io/parquet/core.py", line 653, in to_parquet
    out = out.compute(**compute_kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/base.py", line 286, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask/base.py", line 568, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 2704, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 2018, in gather
    return self.sync(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 859, in sync
    return sync(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/utils.py", line 326, in sync
    raise exc.with_traceback(tb)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/utils.py", line 309, in f
    result[0] = yield future
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/distributed/client.py", line 1883, in _gather
    raise exception.with_traceback(traceback)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/dask_cudf/io/parquet.py", line 136, in write_partition
    with fs.open(fs.sep.join([path, filename]), mode="wb") as out_file:
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/spec.py", line 962, in open
    f = self._open(
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 144, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 235, in __init__
    self._open()
  File "/home/rladmin/miniconda3/envs/rapids-gpu-bdb/lib/python3.8/site-packages/fsspec/implementations/local.py", line 240, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/data/gpu-bdb/results/q10-results.parquet/part.13.parquet'

randerzander commented 3 years ago

For future reference: when running on nodes with a local-only filesystem, you'll need to create the q10 through q30 results directories under output_dir on every node:

END=30
for i in $(seq 10 $END); do mkdir -p /data/gpu-bdb/results/q$i-results.parquet; done
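
If output_dir differs between systems, the same loop can be wrapped in a small function so the path isn't hard-coded. This is only a rough sketch: `make_result_dirs` is a name made up for illustration (it is not part of gpu-bdb), and the 10 to 30 defaults just mirror the range used above. It still has to be run on each node, since nothing here reaches across machines.

```shell
# Hypothetical helper, not part of gpu-bdb: create the per-query result
# directories under a given output_dir on the node where it is run.
# Usage: make_result_dirs OUTPUT_DIR [FIRST_QUERY] [LAST_QUERY]
make_result_dirs() (
    # The body is a subshell, so these variables do not leak to the caller.
    output_dir="${1:?usage: make_result_dirs OUTPUT_DIR [FIRST] [LAST]}"
    first="${2:-10}"   # assumed default, matching the q10 start used above
    last="${3:-30}"    # assumed default, matching END=30 above

    for i in $(seq "$first" "$last"); do
        # ${output_dir%/} drops one trailing slash so paths don't contain "//".
        mkdir -p "${output_dir%/}/q${i}-results.parquet"
    done
)
```

For the config in this issue that would be `make_result_dirs /data/gpu-bdb/results/`, run once on every worker node before starting the benchmark.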