bigcode-project / bigcodebench

BigCodeBench: Benchmarking Code Generation Towards AGI
https://bigcode-bench.github.io/
Apache License 2.0

[REQUEST] Remove Reflection-Llama-3.1-70B #51

Closed wakamex closed 1 week ago

wakamex commented 2 weeks ago

Rationale

Reflection-Llama-3.1-70B has been proven to be a fraud. Their CEO has repeatedly lied about, and still hasn't fully come clean on, the following:

  1. Uploading weights identical to Llama 3
  2. Faking API responses using Sonnet and ChatGPT

Sources:
https://x.com/MaziyarPanahi/status/1838559480658710982
https://x.com/Yuchenj_UW/status/1833627813552992722

The repo referenced in the request to add Reflection has since been deleted.

The repo that hosts the re-uploaded weights warns that the original benchmarks can't be trusted.

The founder now admits the model doesn't work: https://x.com/mattshumer_/status/1842313328166907995

Proof that the initially released weights are just a LoRA applied to Llama 3: https://www.reddit.com/r/LocalLLaMA/comments/1fb6jdy/reflectionllama3170b_is_actually_llama3/

terryyz commented 1 week ago

Completed