cheshire-cat-ai / core

Production ready AI agent framework
https://cheshirecat.ai
GNU General Public License v3.0
2.14k stars, 282 forks

Fix classify method in stray #859

Closed: lucagobbi closed this 1 week ago

lucagobbi commented 2 weeks ago

Description

Currently, the cat.classify() method in StrayCat loops over the labels and checks whether each label is a substring of the response, i.e. the string the LLM returned as its classification (was this logic intentional?). This can cause bugs when one label is a substring of another. For instance, with the labels ['positive', 'not_so_positive', 'negative'], even if the LLM correctly classifies the sentence as not_so_positive, the method returns positive, since positive is also a substring of the response.
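To make the problem concrete, here is a minimal sketch that reproduces it outside the framework. The functions `classify_by_substring` and `classify_exact` are hypothetical stand-ins written for illustration, not the actual StrayCat code, and the exact comparison is only one possible fix.

```python
from typing import Optional


def classify_by_substring(response: str, labels: list[str]) -> Optional[str]:
    """Return the first label that appears anywhere inside the response."""
    for label in labels:
        if label in response:
            return label
    return None


def classify_exact(response: str, labels: list[str]) -> Optional[str]:
    """Return the response only if, after trimming whitespace, it equals a label."""
    cleaned = response.strip()
    return cleaned if cleaned in labels else None


labels = ["positive", "not_so_positive", "negative"]
llm_response = "not_so_positive"  # what the LLM answered (assumed for the example)

print(classify_by_substring(llm_response, labels))  # positive (wrong label)
print(classify_exact(llm_response, labels))         # not_so_positive
```

Reordering the labels so that longer ones come first would hide the bug in this case, but it would still depend on list order, which is why a stricter comparison looks safer.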

I've also improved the type hints and type checks for better readability.

pieroit commented 2 weeks ago

I agree with the correction, but now the LLM might write more characters than needed (e.g. adding quotes or punctuation), and then the match will fail :/

What about using utils.levhenstein_distance?
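As a rough illustration of the fuzzy approach, the sketch below picks the label with the smallest edit distance to the response. The signature of the utility mentioned above is not shown in this thread, so the example uses its own small `levenshtein` function; `nearest_label` is likewise a hypothetical name.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def nearest_label(response: str, labels: list[str]) -> str:
    """Return the label closest to the response (first one wins on ties)."""
    return min(labels, key=lambda label: levenshtein(response, label))


labels = ["positive", "not_so_positive", "negative"]

# The LLM wrapped its answer in quotes and added a period.
print(nearest_label('"not_so_positive".', labels))  # not_so_positive
```

The extra quotes and punctuation cost only three edits against not_so_positive, while the shorter labels are at least ten edits away, so the noisy answer still resolves to the intended label.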

lucagobbi commented 2 weeks ago

Sure, I agree on a fuzzy system like the one you proposed. Plus, we should encourage the use of dissimilar labels (via documentation) to avoid mistakes like these. I find these cat methods really useful; they deserve more space in the docs. I will update the PR. Thanks Piero!!!

lucagobbi commented 2 weeks ago

Ushh, in the previous version we were not forcing a label onto unclassified responses, since the method returned None if no label matched. With levhenstein_distance we force the nearest label to be returned even if the LLM answers with an outlier like:

labels: ['positive', 'not_so_positive', 'negative']
response: "none"
result: positive

It's a strong stance to force it. What do you think?

pieroit commented 1 week ago

@lucagobbi maybe nearest label with a threshold? Let's run some experiments and pick a direction, thanks for this!
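One way the threshold idea could look, purely as a sketch for experimentation: normalize the edit distance by the longer string's length and reject the match when it exceeds a cutoff. The function name `classify_with_threshold`, the parameter `max_normalized_distance`, and the default value 0.5 are all assumptions chosen for the example, not values agreed on in this PR.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def classify_with_threshold(
    response: str,
    labels: list[str],
    max_normalized_distance: float = 0.5,  # hypothetical cutoff, needs tuning
) -> Optional[str]:
    """Return the nearest label, or None if even the nearest is too far away."""
    best_label, best_score = None, float("inf")
    for label in labels:
        longest = max(len(response), len(label)) or 1  # avoid division by zero
        score = levenshtein(response, label) / longest
        if score < best_score:
            best_label, best_score = label, score
    return best_label if best_score <= max_normalized_distance else None


labels = ["positive", "not_so_positive", "negative"]

print(classify_with_threshold('"not_so_positive".', labels))  # not_so_positive
print(classify_with_threshold("none", labels))                # None
# A cutoff of 1.0 accepts anything, reproducing the forced behaviour:
print(classify_with_threshold("none", labels, max_normalized_distance=1.0))  # positive
```

With the "none" outlier, the closest labels sit at a normalized distance of 0.75, so the 0.5 cutoff restores the old behaviour of returning None, while the quoted answer (about 0.17) still matches.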

lucagobbi commented 1 week ago

I haven't added the Levenshtein method yet; we'll need to reopen this PR if we want to include it.