running with a hosted model?

jparkerweb / semantic-chunking

🍱 semantic-chunking ⇢ semantically create chunks from large document for passing to LLM workflows

https://www.npmjs.com/package/semantic-chunking

MIT License

39 stars, 2 forks

running with a hosted model? #5

Closed: dcsan closed this issue 2 months ago

dcsan commented 4 months ago

I'd like to include this technique in a larger project, but I'm worried about the requirement to download the model, as I'm using a containerized environment.

What would be involved in porting this code to use a hosted model, e.g. via HF? I assume it's not enough to have an embedding API?

Alternatively, could there be an installer that grabs the required files, so they can be added to a Docker container?
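
To make the embedding-API question concrete, here is a minimal sketch of similarity-based chunking in which the embedder is passed in as a function. It is not the library's implementation: `chunkBySimilarity`, the `embed` parameter, and the `threshold` default are all hypothetical names chosen for illustration. The idea is only that, if chunk boundaries are decided from embeddings of adjacent sentences, any function that maps text to a vector (local model or hosted API) could in principle fill that role.

```javascript
// Sketch under stated assumptions; not semantic-chunking's actual code.
// `embed` is a hypothetical async function: (text: string) => Promise<number[]>.
// It could wrap a hosted embedding API or a locally loaded model.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Start a new chunk whenever adjacent sentences are less similar than `threshold`.
async function chunkBySimilarity(sentences, embed, threshold = 0.5) {
  if (sentences.length === 0) return [];
  const vectors = await Promise.all(sentences.map((s) => embed(s)));
  const chunks = [[sentences[0]]];
  for (let i = 1; i < sentences.length; i++) {
    const similarity = cosineSimilarity(vectors[i - 1], vectors[i]);
    if (similarity < threshold) chunks.push([]); // topic shift: open a new chunk
    chunks[chunks.length - 1].push(sentences[i]);
  }
  return chunks.map((chunk) => chunk.join(" "));
}
```

With a hosted model, `embed` would be a thin wrapper around an HTTP call. Whether that is sufficient for this library depends on whether it also needs the model's tokenizer locally (for example, to measure chunk sizes in tokens), which this sketch does not address.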

jparkerweb commented 4 months ago

I added a download-models tool that might help you get closer to your goal of pre-downloading/packaging the model files. See: https://github.com/jparkerweb/semantic-chunking?tab=readme-ov-file#-pre-downloading-models
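
When model files are baked into a container image, a start-up check can catch a missing file before the application tries to fetch anything over the network. The sketch below is one way to do that. The function name `findMissingModelFiles`, the directory path, and the file names in the usage comment are assumptions for illustration, so check what the download tool actually writes before relying on them.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Returns the entries of `requiredFiles` (paths relative to `modelDir`)
// that do not exist on disk. An empty array means everything was found.
function findMissingModelFiles(modelDir, requiredFiles) {
  return requiredFiles.filter(
    (relativePath) => !fs.existsSync(path.join(modelDir, relativePath))
  );
}

// Hypothetical usage at container start-up (path and file names are assumed):
//
//   const missing = findMissingModelFiles("./models/my-model", [
//     "config.json",
//     "tokenizer.json",
//   ]);
//   if (missing.length > 0) {
//     console.error("Missing model files:", missing);
//     process.exit(1);
//   }
```

Failing fast like this makes it clear whether a runtime error comes from missing files or from the environment itself.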

davevilela commented 4 weeks ago

@jparkerweb hey, could I run this on RunPod or Cloudflare Workers AI?

davevilela commented 3 weeks ago

I'm trying to use this library in a Cloudflare Worker, and I'm getting this error:

Here's an overview of my setup:

[Screenshot, 2024-10-22 at 15:43:05]

dcsan commented 3 weeks ago

Not the author, but I would be surprised if it works in a Worker: it needs to download a lot of model data the first time it's run.

davevilela commented 3 weeks ago

@dcsan Hey! What would you recommend to deploy this?

davevilela commented 3 weeks ago

@dcsan This error occurs even after manually downloading the models locally. I have no clue how to solve this.