rapidsai / crossfit

Metric calculation library
Apache License 2.0

Added ctranslate2 translation example script #83

Open uahmed93 opened 6 days ago

uahmed93 commented 6 days ago

Added a ctranslate2 example that works on string tokens instead of integer tokens.

To run the example:

python3 example/custom_ct2_model.py --ct2-model-dir <your-model-dir> inp.parquet out.parquet

copy-pr-bot[bot] commented 6 days ago

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

uahmed93 commented 6 days ago

This script fails because the ct2 model cannot be serialized when it is loaded on the worker. I ran it after initializing an NDC GPU Dask cluster on Slurm, but it does not work there.

VibhuJawa commented 6 days ago

@uahmed93, can you please post the error you saw here?

uahmed93 commented 6 days ago

Here:

Deployed LocalCUDACluster(22f3d4fd, 'tcp://127.0.0.1:37693', workers=1, threads=1, memory=1.79 TiB)...
2024-09-11 10:56:33,005 - distributed.protocol.pickle - ERROR - Failed to serialize <ToPickle: HighLevelGraph with 4 layers.
<dask.highlevelgraph.HighLevelGraph object at 0x15522cf81900>
 0. read-parquet-2f062a30cf4676cd0b9a4ab5cf06fe85
 1. repartition-2-aa356a5cb4dedecfbdd625f941566718
 2. to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1
 3. store-to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1
>.
Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
TypeError: cannot pickle 'ctranslate2._ext.Translator' object
Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 63, in dumps
    result = pickle.dumps(x, **dump_kwargs)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 68, in dumps
    pickler.dump(x)
AttributeError: Can't pickle local object 'to_parquet.<locals>.<lambda>'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 353, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 76, in pickle_dumps
    frames[0] = pickle.dumps(
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1479, in dumps
    cp.dump(obj)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1245, in dump
    return super().dump(obj)
TypeError: cannot pickle 'ctranslate2._ext.Translator' object

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/ctransl/ctransl_cf.py", line 145, in <module>
    main()
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/ctransl/ctransl_cf.py", line 141, in main
    outputs.to_parquet(args.output_parquet_path)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/nvtx/nvtx.py", line 116, in inner
    result = func(*args, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask_cudf/core.py", line 264, in to_parquet
    return to_parquet(self, path, *args, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask/dataframe/io/parquet/core.py", line 1047, in to_parquet
    out = out.compute(**compute_kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask/base.py", line 379, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/dask/base.py", line 665, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/lustre/fsw/portfolios/llmservice/users/uahmed/test_env/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 379, in serialize
    raise TypeError(msg, str_x) from exc
TypeError: ('Could not serialize object of type HighLevelGraph', '<ToPickle: HighLevelGraph with 4 layers.\n<dask.highlevelgraph.HighLevelGraph object at 0x15522cf81900>\n 0. read-parquet-2f062a30cf4676cd0b9a4ab5cf06fe85\n 1. repartition-2-aa356a5cb4dedecfbdd625f941566718\n 2. to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1\n 3. store-to-parquet-03d4d012cafd9ed3bdb4ee7374761cf1\n>')
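
The traceback suggests that the task graph holds a live `ctranslate2._ext.Translator`, which cannot be pickled when the graph is shipped to the worker. One common way around this kind of error, sketched below under stated assumptions, is to keep only picklable configuration (such as the model directory) on the object that enters the graph and to build the native handle lazily on first use. `FakeTranslator` and `LazyModel` are hypothetical names used for illustration only; they are not part of crossfit or ctranslate2, and the stand-in holds a lock simply to make it unpicklable.

```python
import pickle
import threading


class FakeTranslator:
    """Hypothetical stand-in for an unpicklable native handle."""

    def __init__(self, model_dir):
        self.model_dir = model_dir
        self._lock = threading.Lock()  # locks cannot be pickled

    def translate_batch(self, batches):
        # Placeholder "translation": upper-case every string token.
        return [[tok.upper() for tok in toks] for toks in batches]


class LazyModel:
    """Holds picklable config only; creates the handle on first use."""

    def __init__(self, model_dir):
        self.model_dir = model_dir
        self._translator = None  # built lazily, never pickled

    def __getstate__(self):
        # Drop the live handle so that pickling always succeeds.
        state = self.__dict__.copy()
        state["_translator"] = None
        return state

    def translate(self, batches):
        if self._translator is None:
            # Assumption: the real script would construct the actual
            # translator here, inside the worker process.
            self._translator = FakeTranslator(self.model_dir)
        return self._translator.translate_batch(batches)


if __name__ == "__main__":
    model = LazyModel("path/to/model")  # placeholder path
    model.translate([["a"]])  # forces the handle to be created locally
    clone = pickle.loads(pickle.dumps(model))  # still picklable
    print(clone.translate([["hello", "world"]]))
```

In the real script, the lazy branch would presumably construct `ctranslate2.Translator(model_dir, device="cuda")` instead of the stand-in, so that each worker loads its own copy of the model rather than receiving one over the wire.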

VibhuJawa commented 1 day ago

Also, CC: @sarahyurick for suggestions.