-
Hi,
I am trying to run some LLMs (currently OpenAI models) on MMLU. My first question: which configuration is the standard setup (5-shot without CoT)? What does flan mean in some of the c…
-
@epage thank you for [the mention](https://news.ycombinator.com/item?id=42079137) earlier this week. I wanted to follow up as a lot of progress has been made since we last spoke. Bencher can now creat…
-
Write a testing script for GameEngine to allow us to build until the tests pass.
-
```
How many times did the compiler break?
How many times did someone say "we" should write a test harness for the
compiler?
Ha, if only we had a janitor to do that for us.
```
Original issue repo…
-
Replace bundled dev wallet with https://github.com/onflow/fcl-next-harness.
-
Scope the test harnesses and implement them.
-
As mentioned in #18, we should improve the test harness. I think we should leverage parts of the `twilio-run` tool for this, since it already puts effort into simulating the Twilio Runtime…
-
Should include an example of overwriting the datasource.
See https://github.com/lightblue-platform/lightblue-core/pull/463
See https://github.com/lightblue-platform/lightblue-core/pull/462
-
The test harness would need to cover the following:
- Elasticsearch 7
- Elasticsearch 8
- OpenSearch
- TLS communication with indexing services
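One way to make this coverage explicit is to enumerate it as a test matrix. The sketch below is only an illustration: the backend labels, the `HarnessCase` class, and the assumption that every backend is exercised both with and without TLS are hypothetical and not taken from the harness itself.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical labels for the indexing services listed above.
BACKENDS = ("elasticsearch-7", "elasticsearch-8", "opensearch")
# Assumption: each backend is tested over plain HTTP and over TLS.
TLS_MODES = (False, True)


@dataclass(frozen=True)
class HarnessCase:
    """One backend/transport combination the harness would run against."""
    backend: str
    tls: bool

    @property
    def url_scheme(self) -> str:
        # TLS cases talk to the indexing service over https.
        return "https" if self.tls else "http"

    @property
    def case_id(self) -> str:
        # Short identifier usable as a test name or CI job name.
        return f"{self.backend}-{'tls' if self.tls else 'plain'}"


def build_matrix(backends=BACKENDS, tls_modes=TLS_MODES):
    """Return every backend x TLS combination as a list of HarnessCase."""
    return [HarnessCase(b, t) for b, t in product(backends, tls_modes)]


if __name__ == "__main__":
    for case in build_matrix():
        print(case.case_id, case.url_scheme)
```

Each `case_id` could then serve as a parametrized test ID or a CI job name, so a missing combination shows up as a missing job instead of silently going untested.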
-
Error Summary for Test Case: IDM-10.2
Details: Problems with conformance identified during validation.
DUT : tv-app
- [ ] Unknown Cluster (1875):
The logs indicate that commands with unknown di…