xlang-ai / instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Apache License 2.0

Production Level Updates #93

Open Nicholas-Schaub opened 8 months ago

Nicholas-Schaub commented 8 months ago

If the maintainers of the repo are open to pull requests, would they be interested in one for code cleanup and some production-level updates? There are a lot of things in the code that just don't make sense, such as setting the max sequence length and then checking it, or redundantly sending models to devices. This isn't great for a production-level deployment where we want to embed millions or billions of documents.

I am going to be making these changes, but I don't want to fork and diverge from the repo. I'd rather have my work pulled into this repo, so that I don't have to worry about downstream updates and so that it benefits anyone who uses it.

hongjin-su commented 8 months ago

Hi, thanks a lot for your interest in INSTRUCTOR!

I welcome all levels of code cleanup! Feel free to open the pull request!

ashokrajab commented 7 months ago

Raised one such PR: https://github.com/xlang-ai/instructor-embedding/pull/92