I agree that internals need a more explicit standard, especially around adding support for new models.
New LMs can be modeled after this class, though they can probably be shorter: https://github.com/stanfordnlp/dspy/blob/main/dsp/modules/gpt3.py
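For concreteness, here's a rough sketch of the shape a new LM wrapper might take, modeled loosely on the `GPT3` class above. The `LM` base class and the `basic_request`/`__call__` split follow `dsp/modules/lm.py`; the HTTP endpoint, payload, and response format below are hypothetical stand-ins for whatever provider API you're wrapping.

```python
# Minimal sketch of a custom LM wrapper (not a real provider integration).
# Assumes the LM base class from dsp/modules/lm.py; the endpoint and
# response shape are placeholders.
import requests

from dsp.modules.lm import LM


class MyCustomLM(LM):
    def __init__(self, model, api_base="http://localhost:8000/v1", **kwargs):
        super().__init__(model)           # sets self.kwargs defaults and self.history
        self.api_base = api_base          # hypothetical endpoint
        self.kwargs.update(kwargs)        # merge user overrides into defaults

    def basic_request(self, prompt, **kwargs):
        # One raw completion call, recorded in self.history as GPT3 does.
        payload = {**self.kwargs, **kwargs, "prompt": prompt}
        response = requests.post(f"{self.api_base}/completions", json=payload).json()
        self.history.append({"prompt": prompt, "response": response, "kwargs": kwargs})
        return response

    def __call__(self, prompt, only_completed=True, return_sorted=False, **kwargs):
        # dsp expects a list of completion strings back.
        response = self.basic_request(prompt, **kwargs)
        return [choice["text"] for choice in response.get("choices", [])]
```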
For chunking strategies, that feels outside the scope. It should be straightforward to handle that in user code? Or is there an argument for making this a built-in thing?
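To illustrate what handling it in user code could look like: a naive fixed-size chunker with overlap is only a few lines. The chunk size, overlap, and whitespace tokenization here are placeholders to tune, not a recommendation.

```python
# Naive chunking handled entirely in user code: split on whitespace into
# overlapping fixed-size word windows.
def chunk_text(text, chunk_size=512, overlap=64):
    words = text.split()
    step = chunk_size - overlap
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]
```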
I do agree chunking feels outside the scope (I hesitated to bring it up in the first place), but I ended up including it because it's hard for me to separate these topics into clean buckets once I reach the implementation stage. It feels a bit like the microservices vs. monolith debate: my current bias is to think of these LM systems as a single thing rather than as isolated modules, each with a single responsibility.
Implementation is challenging because there are strong dependencies between completions, prompts, context, and embeddings, and it's difficult to know up front which strategies will work best for a specific use case (this is exactly the problem I'm facing now, asking myself "should I implement this myself, or should it be in the library?"). That leads me to want more flexibility for easily swapping out all of the pieces, though I don't think that necessarily implies DSPy should take on all the responsibility.
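As a concrete example of the kind of swapping I have in mind, here's a sketch assuming the `dspy.OpenAI` and `dspy.HFModel` wrappers and `dspy.settings.configure`; the model names are placeholders.

```python
# Sketch: the program stays fixed while the LM behind it is exchanged.
import dspy

candidates = [
    dspy.OpenAI(model="gpt-3.5-turbo"),
    dspy.HFModel(model="meta-llama/Llama-2-7b-hf"),
]

qa = dspy.Predict("question -> answer")  # the pipeline piece under test

for lm in candidates:
    dspy.settings.configure(lm=lm)
    print(type(lm).__name__, qa(question="What is DSPy?").answer)
```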
I think what I need to do before considering adding more to DSPy is build a full application with DSPy, experiment with swapping things in and out, and see what happens.
@jhyearsley can you point out where in the dspy library this was implemented?
Any plans to integrate Gemini and Bedrock? Independent of DSPy, I'm working on testing different models, similar to what Anyscale does in "Building RAG-based LLM Applications for Production". I already have my own code for this, and many others have written similar code, but as I start to use DSPy I'm getting confused about where the boundary of the library should be. It seems to me that DSPy would benefit from standardizing a bit more, but I'm not sure whether that is the vision for the library.
Personally, I'd prefer the library to let me swap out models, chunking strategies, etc. more easily (again, very similar to what is done in the blog post shared above). I see the library as an obvious replacement for LangChain-like libraries and the many other variants attempting to do more or less the same thing. As it currently stands, though, I'm not sure exactly how I would integrate DSPy into my own workflow without adding some of the complexity I'm trying to avoid. Any thoughts?