ngruver / llmtime

https://arxiv.org/abs/2310.07820
MIT License
673 stars 157 forks

New models, Google Gemini, some adjustments #25

Open zaizou opened 7 months ago

zaizou commented 7 months ago

In this update,

Please note that the NLL calculation is not yet implemented for the Google models, nor for the earlier Mistral model.

zaizou commented 7 months ago

Are we okay?

ngruver commented 7 months ago

Hello,

I think I will need to update the readme (beyond what is proposed in the PR) to make it more clear now that there are many model options. I can take a look this weekend.

zaizou commented 7 months ago

Okay. Maybe we could also work on the architecture (OOP), so that when a company modifies or adds a model or API (Mistral Large and Gemini Ultra, for instance) we don't have to write new code. It would be interesting to discuss how the features might evolve, if there are plans for that, and perhaps some integration with other research and theses. Let me know if you're open to a collaboration; I have some ideas 👍
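To make the OOP idea concrete, here is a minimal sketch of a backend registry. It is not code from the PR or from llmtime: `ModelBackend`, `register`, `get_backend`, and the `echo-demo` entry are all hypothetical names, and the tokenizer and completion callables are placeholders rather than real API calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ModelBackend:
    """Everything the forecasting code would need from one model (hypothetical)."""
    name: str
    tokenize: Callable[[str], List[str]]       # text -> tokens
    complete: Callable[[str, int], List[str]]  # (prompt, num_samples) -> completions


_REGISTRY: Dict[str, ModelBackend] = {}


def register(backend: ModelBackend) -> None:
    """Add a backend; refuse silent overwrites of an existing name."""
    if backend.name in _REGISTRY:
        raise ValueError(f"backend {backend.name!r} is already registered")
    _REGISTRY[backend.name] = backend


def get_backend(name: str) -> ModelBackend:
    """Look up a backend by model name, listing known names on failure."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown model {name!r}; known: {sorted(_REGISTRY)}") from None


# Placeholder backend: splits on commas and echoes the prompt back.
# A real entry would wrap an API client and that provider's tokenizer.
register(ModelBackend(
    name="echo-demo",
    tokenize=lambda text: text.split(","),
    complete=lambda prompt, n: [prompt] * n,
))
```

With something like this, supporting a new provider would mean one more `register(...)` call rather than edits scattered across the codebase.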

ngruver commented 7 months ago

I'm reading through your PR and I noticed there might be an issue in the last set of changes. In the code for the Mistral API, it looks like you are using the OpenAI tokenizers: https://github.com/ngruver/llmtime/blob/main/models/mistral_api.py#L33. Did you confirm that this makes sense? It probably makes more sense to use the Hugging Face Mistral tokenizers, unless it is known that their API models use a different tokenizer.
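A rough sketch of what choosing the tokenizer per model family could look like. This is not the code in `mistral_api.py`: the helper names and the prefix rule are hypothetical, and the default `hf_repo` is only an assumption about which Hugging Face checkpoint matches the API models (it may also require access approval and a download).

```python
from typing import Callable, List


def tokenizer_backend_for(model: str) -> str:
    """Pick a tokenizer family from the model name (hypothetical naming rule)."""
    name = model.lower()
    if name.startswith(("mistral", "mixtral", "open-mistral")):
        return "huggingface"
    return "tiktoken"


def load_encoder(model: str,
                 hf_repo: str = "mistralai/Mistral-7B-v0.1") -> Callable[[str], List[int]]:
    """Return a text -> token-id function for the given model.

    Imports are done lazily so that only the library actually needed
    has to be installed.
    """
    if tokenizer_backend_for(model) == "huggingface":
        from transformers import AutoTokenizer
        tok = AutoTokenizer.from_pretrained(hf_repo)  # assumed checkpoint
        return lambda text: tok.encode(text, add_special_tokens=False)
    import tiktoken
    return tiktoken.encoding_for_model(model).encode
```

The point is only that token counts for the same serialized series can differ between tokenizers, so any estimate of input length should come from the tokenizer the model actually uses.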

ngruver commented 7 months ago

Other issues/comments:

zaizou commented 7 months ago

Hello, of course, all of your remarks are good. I wanted to pass the system message as a parameter, but I didn't have much time. I also thought about making things OOP. The two other points will be done soon.

zaizou commented 7 months ago

Also, OK on the Mistral tokenizer. Tell me if you want to do it yourself.