rapidsai / crossfit

Metric calculation library
Apache License 2.0

[BUG] numpy casting bug #56

Closed VibhuJawa closed 1 month ago

VibhuJawa commented 2 months ago

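NumPy's adoption of NEP 50 introduces a minor breaking bug in the repo: `can_cast()` now raises a TypeError when given Python ints, floats, or complex values. We should fix it ASAP.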

068da6e953f6d786f578cc6e1bb3e', 'read_single_partition-2553c36722db2b509eeeb40e1b326eb0', ['/datasets/prospector-lm/cleaned_exact_dedup_all_cc/crawl-data-CC-MAIN-2020-29-segments-1593655878519.27-warc-CC-MAIN-20200702045758-20200702075758-00007.jsonl'])
kwargs:    {}
Exception: "TypeError('can_cast() does not support Python ints, floats, and complex because the result used to depend on the value.\\nThis change was part of adopting NEP 50, we may explicitly allow them again in the future.')"

2024-06-26 22:29:04,798 - distributed.worker - WARNING - Compute Failed
Key:       ('single_partition_write_with_filename-a71769790451278106c3035b49dcd992', 6)
Function:  subgraph_callable-5dfffd8e40bfb0861d4d7becca088674
args:      ('/raid/vjawa/prospector-lm/embeddings_crossfit_fb_c4_10', '<crossfit.backend.torch.op.base.Predictor object a-5c9be063012665285e84a5328d21840a', {'number': 6, 'division': None}, '<crossfit.op.tokenize.Tokenizer object at 0x7f04e9-d26068da6e953f6d786f578cc6e1bb3e', 'read_single_partition-2553c36722db2b509eeeb40e1b326eb0', ['/datasets/prospector-lm/cleaned_exact_dedup_all_cc/crawl-data-CC-MAIN-2020-29-segments-1593655878519.27-warc-CC-MAIN-20200702045758-20200702075758-00006.jsonl'])
kwargs:    {}
Exception: "TypeError('can_cast() does not support Python ints, floats, and complex because the result used to depend on the value.\\nThis change was part of adopting NEP 50, we may explicitly allow them again in the future.')"

VibhuJawa commented 1 month ago

Fixed by https://github.com/rapidsai/crossfit/pull/60/files#
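
For readers who hit the same `TypeError`, here is a minimal sketch of one way to avoid passing Python scalars to `np.can_cast()`. It is not necessarily what the linked PR does; `can_cast_scalar` is a hypothetical helper name introduced only for illustration, and the range check for Python ints is an assumption about the intended semantics (it mimics the old value-based behavior).

```python
import numpy as np


def can_cast_scalar(value, dtype):
    """Hypothetical helper: decide whether `value` fits into `dtype`
    without handing a Python int/float/complex to np.can_cast(),
    which raises TypeError once NEP 50 is adopted."""
    dtype = np.dtype(dtype)

    # bool is a subclass of int, so handle it first.
    if isinstance(value, bool):
        return np.can_cast(np.bool_, dtype)

    if isinstance(value, int):
        if dtype.kind in "iu":
            # Explicit range check replaces the old value-based casting rule.
            info = np.iinfo(dtype)
            return info.min <= value <= info.max
        # Assumption: any Python int is acceptable for float/complex targets.
        return dtype.kind in "fc"

    if isinstance(value, float):
        return dtype.kind in "fc"

    if isinstance(value, complex):
        return dtype.kind == "c"

    # NumPy scalars, arrays, and dtypes are still accepted by np.can_cast().
    return np.can_cast(value, dtype)


print(can_cast_scalar(5, np.int8))
print(can_cast_scalar(300, np.int8))
```

The helper only special-cases plain Python scalars; anything that already carries a NumPy dtype is passed through unchanged, so existing dtype-to-dtype checks keep their behavior.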