For example, beir_report.py works with nvcr.io/nvidia/pytorch:23.09-py3 and nvcr.io/nvidia/pytorch:23.10-py3, but it no longer works with nvcr.io/nvidia/pytorch:23.11-py3.
The error message with nvcr.io/nvidia/pytorch:23.11-py3 is:
2023-12-08 01:25:45,495 - distributed.protocol.pickle - ERROR - Failed to serialize <ToPickle: HighLevelGraph with 4 layers.
<dask.highlevelgraph.HighLevelGraph object at 0x7f2c902a19f0>
0. read-parquet-a4e9d892750bb1499f10a2b9f98520c2
1. repartition-2-70abc3715803fad94c0fca8e779ab753
2. to-parquet-0190cffb61c5a5875f54ece4c9c1999c
3. store-to-parquet-0190cffb61c5a5875f54ece4c9c1999c
>.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
_pickle.PicklingError: Can't pickle <cyfunction ParquetFragmentScanOptions._reconstruct at 0x7f2d3022e9b0>: it's not the same object as pyarrow._dataset_parquet.ParquetFragmentScanOptions._reconstruct

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py", line 352, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py", line 75, in pickle_dumps
    frames[0] = pickle.dumps(
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/usr/local/lib/python3.10/dist-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
_pickle.PicklingError: Can't pickle <cyfunction ParquetFragmentScanOptions._reconstruct at 0x7f2d3022e9b0>: it's not the same object as pyarrow._dataset_parquet.ParquetFragmentScanOptions._reconstruct

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspace/beir_report.py", line 48, in <module>
    main()
  File "/workspace/beir_report.py", line 35, in main
    report = cf.beir_report(
  File "/workspace/crossfit/report/beir/report.py", line 185, in beir_report
    embeddings: EmbeddingDatataset = embed(
  File "/workspace/crossfit/report/beir/embed.py", line 78, in embed
    embeddings.to_parquet(os.path.join(emb_dir, dtype))
  File "/usr/local/lib/python3.10/dist-packages/nvtx/nvtx.py", line 101, in inner
    result = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask_cudf/core.py", line 252, in to_parquet
    return to_parquet(self, path, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/dataframe/io/parquet/core.py", line 1062, in to_parquet
    out = out.compute(**compute_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 342, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 628, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/distributed/protocol/serialize.py", line 374, in serialize
    raise TypeError(msg, str(x)[:10000]) from exc
TypeError: ('Could not serialize object of type HighLevelGraph', '<ToPickle: HighLevelGraph with 4 layers.\n<dask.highlevelgraph.HighLevelGraph object at 0x7f2c902a19f0>\n 0. read-parquet-a4e9d892750bb1499f10a2b9f98520c2\n 1. repartition-2-70abc3715803fad94c0fca8e779ab753\n 2. to-parquet-0190cffb61c5a5875f54ece4c9c1999c\n 3. store-to-parquet-0190cffb61c5a5875f54ece4c9c1999c\n>')
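The chain above mixes two distinct pickling failures. The first, "Can't pickle local object 'to_parquet.<locals>.<lambda>'", is the stdlib pickler rejecting a function that is defined inside another function: pickle serializes functions by qualified name, and a local lambda cannot be looked up at module level. This part is easy to reproduce in isolation (the `make_writer` name below is illustrative, not crossfit or dask code):

```python
import pickle

def make_writer():
    # A function defined inside another function, analogous to the
    # lambda that dask's to_parquet creates internally.
    return lambda partition: partition

try:
    pickle.dumps(make_writer())
except AttributeError as exc:
    # The stdlib pickler stores functions by qualified name, and
    # 'make_writer.<locals>.<lambda>' cannot be resolved at top level,
    # e.g.: Can't pickle local object 'make_writer.<locals>.<lambda>'
    print(exc)
```

Dask normally papers over this by falling back to cloudpickle, which serializes local functions by value; the real failure here is that the cloudpickle fallback also raises, as the second error shows.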
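The second failure, "it's not the same object as pyarrow._dataset_parquet.ParquetFragmentScanOptions._reconstruct", is the generic rebound-global pickling error: the pickler looks the function up by its qualified name, finds a *different* object than the one being pickled, and refuses. That pattern typically appears when an extension module's function gets re-registered or shadowed, which suggests a pyarrow/cloudpickle version interaction introduced in the 23.11 image. The same error can be produced with plain stdlib code (all names below are illustrative):

```python
import pickle

def reconstruct():
    pass

saved = reconstruct

def reconstruct():  # rebinding the name, loosely analogous to what a
    pass            # re-registered extension function does in pyarrow

try:
    pickle.dumps(saved)
except pickle.PicklingError as exc:
    # Pickle resolves `saved` by qualified name, finds the second
    # `reconstruct`, and reports "it's not the same object as ...".
    print(exc)
```

If that diagnosis is right, comparing the pyarrow, cloudpickle, and dask versions shipped in the 23.10 and 23.11 images would be a reasonable next step.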