Psycoy / MixEval

The official evaluation suite and dynamic data release for MixEval.
https://mixeval.github.io/

How to submit my model to the Leaderboard? #27

Open Waneila opened 3 months ago

Psycoy commented 3 months ago

Hi Waneila, you can run the tests on MixEval-Hard and MixEval and post a screenshot of the results here. Please make sure to follow the instructions in the repo.

Thanks, Jinjie

Waneila commented 2 months ago

Hi Jinjie,

Here are the results of our model, Spark4.0, on MixEval-Hard and MixEval for the 20240601 version. Please include our model name in the leaderboard.

Thank you, and I wish you a pleasant day.

Best regards, Waneila


Waneila commented 2 months ago

I have already replied to the issue via email and submitted our results. Please confirm if you have received them.

Thanks, Waneila

Psycoy commented 2 months ago

Hi @Waneila ,

I cannot see the results here. Were they attached as an image?

Waneila commented 2 months ago

Hi Jinjie,

I'm sorry, the image was sent via email, so it might not display here. Let me resend the screenshot. Here are the results of our model, Spark4.0, on MixEval-Hard and MixEval for the 20240601 version. Please include our model name in the leaderboard.

[Screenshots: mixeval_hard, mixeval]

Thank you, and I wish you a pleasant day.

Best regards, Waneila

Psycoy commented 2 months ago

Hi @Waneila ,

Is there a technical report or paper for your model? We only add publicly known models to the leaderboard. If there is one, would you kindly give us a pointer? We will look into it. If not, you could first report the results in the paper and then contact us to add the model to the leaderboard as soon as it is released.

Have a nice day!

Waneila commented 2 months ago

Hi Jinjie,

This is the access address for our model: https://xinghuo.xfyun.cn/spark. You are welcome to visit this interface.

Thanks, Waneila

Waneila commented 1 month ago

Hi Jinjie,

We have already provided our model's homepage. Approximately when can the model be added to the leaderboard? If there are any issues, please contact me promptly.

Thanks, Waneila

Psycoy commented 1 month ago

Hi @Waneila ,

It's already on the leaderboard. Please check whether you can see it.

Waneila commented 1 month ago

Hi Jinjie,

We can now see our model on the leaderboard. Thanks for your support.

Waneila commented 4 weeks ago

Hi Jinjie,

Our model, Spark 4.0, has been updated to Spark 4.5. We are pleased to announce that, compared to the previous version, our latest model has achieved significant improvements on the MixEval-Hard and MixEval tasks. Here are the results of our latest model, Spark 4.5, on MixEval-Hard and MixEval for the 20240601 version. Please update our model on the leaderboard.

[Screenshots: mixeval-hard, mixeval]

Thank you, and I wish you a nice day!

Waneila

Waneila commented 3 weeks ago

Hi Jinjie,

I'm sorry, our latest model is named Spark 4.0-2024-10-14, not Spark 4.5. Please take note.

Thanks, Waneila

Psycoy commented 3 days ago

Hi Waneila,

Noted. We will update the results soon.

Waneila commented 3 days ago

Hi Jinjie,

When can the update be completed?

Psycoy commented 1 day ago

Hi @Waneila ,

We are currently reviewing some models to be added to the leaderboard. We will update the results once the review is done.