Open shreyas-shinde opened 1 week ago
FYI: I have actually tried the numbered approach and saw both cost and latency go down because of the reduction in output tokens. The eval score of the downstream task (QA in my case) also went up.
@shreyas-shinde feel free to contribute!
@shreyas-shinde So is the process to:
You can check the how-to here