BiomedSciAI / fuse-med-ml

A Python framework accelerating ML-based discovery in the medical field by encouraging code reuse. Batteries included :)
Apache License 2.0

Integrate `ModularTokenizerOp` with Hugging Face remote 🤗 #368

Closed: SagiPolaczek closed this 1 week ago

SagiPolaczek commented 1 week ago

Adds a Hugging Face-style API to `ModularTokenizerOp`, so that a tokenizer can be loaded either from a Hugging Face Hub repository (remote) or from a local directory.

API:

```python
from fuse.data.tokenizers.modular_tokenizer.op import ModularTokenizerOp

# Load from a Hugging Face Hub repository id
tokenizer_op = ModularTokenizerOp.from_pretrained("repo/id")
# OR load from a local tokenizer directory
tokenizer_op = ModularTokenizerOp.from_pretrained("/local/path/to/tokenizer/dir")
```
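For readers wondering how a single `from_pretrained` argument can cover both cases, here is a minimal sketch of one possible dispatch. It is not taken from this PR: `resolve_tokenizer_dir` is a hypothetical helper name, and the choice of `huggingface_hub.snapshot_download` to fetch the remote files is an assumption about how such an integration could work.

```python
import os


def resolve_tokenizer_dir(identifier: str) -> str:
    """Hypothetical helper: map a local path or a Hub repo id to a local directory."""
    if os.path.isdir(identifier):
        # An existing local directory is used as-is; no network access is needed.
        return identifier

    # Otherwise the string is treated as a Hub repo id (an assumption of this sketch).
    # The import is lazy so that local-only use does not require huggingface_hub.
    from huggingface_hub import snapshot_download

    # snapshot_download fetches the repository files and returns the local folder path.
    return snapshot_download(repo_id=identifier)
```

Checking for a local directory first keeps offline workflows working; one trade-off is that a mistyped local path is then interpreted as a repo id and fails with a Hub error rather than a file-not-found error.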