-
Why does running the project on multiple GPUs produce garbled inference output and very poor evaluation results? What is the cause of this?
Another question: the paper says the experiments use llama2's default parameters (temperature, etc.), but the actual inference appears to use llama-factory's default of 0.95, whereas the model's default temperature is 0.6.
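For reference, a minimal sketch of pinning the sampling parameters explicitly via the Hugging Face `transformers` API instead of relying on a framework default; the model name and values below are only illustrative assumptions:
```python
# Minimal sketch: set sampling parameters explicitly instead of relying on a
# framework default (e.g. llama-factory's temperature=0.95 vs. the Llama-2
# generation_config value of 0.6). Model name and values are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "meta-llama/Llama-2-7b-chat-hf"  # hypothetical choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

gen_cfg = GenerationConfig(
    do_sample=True,
    temperature=0.6,   # the model's own default, not the framework's 0.95
    top_p=0.9,
    max_new_tokens=256,
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, generation_config=gen_cfg)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```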
-
Hi, I used the latest BFCL code to evaluate ToolACE-8B and other models, and found that the Miss Param score was very low; ToolACE-8B, for example, scored only 2.5%.
Then I found that the common cause …
-
# Main todos:
- [ ] Check whether all move sorting tables are correctly allocated, cleared, or scaled, and prepare tests for them (see the sketch below). Clearing may not be necessary
- [ ] Prepare and test better coef …
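For illustration only (in Python, not the engine's own source), a generic sketch of a move-sorting history table with the allocation, clearing, and scaling operations the first item refers to; the indexing scheme and depth-squared bonus are common conventions, not details taken from this repository:
```python
# Generic illustration of a move-sorting "history" table and the three
# maintenance operations mentioned above: allocation, clearing, scaling.
class HistoryTable:
    def __init__(self, num_pieces=12, num_squares=64):
        # Allocation: one counter per (piece, destination square) pair.
        self.table = [[0] * num_squares for _ in range(num_pieces)]

    def clear(self):
        # Full reset, e.g. at the start of a new game.
        for row in self.table:
            for sq in range(len(row)):
                row[sq] = 0

    def scale(self, divisor=2):
        # Aging between searches: halve instead of clearing, so old
        # information decays but is not discarded entirely.
        for row in self.table:
            for sq in range(len(row)):
                row[sq] //= divisor

    def bonus(self, piece, square, depth):
        # Reward a move that caused a cutoff; depth squared is a common weight.
        self.table[piece][square] += depth * depth

    def score(self, piece, square):
        return self.table[piece][square]
```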
-
# Task Description: Cyber Security Dashboard
## Background
As a company dealing with sensitive data, we must keep track of our cyber security stats to ensure the safety and integrity of…
-
For completeness, and to address one of the reviewers' comments, we should also run the RL training with the safety filter included.
@Erfi: Where in the current setup can I add the safety filter (which is kind of …
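Not knowing the exact training code, one common integration point is an environment wrapper that filters every proposed action before it reaches the underlying environment, so any RL algorithm can then train on the wrapped environment unchanged; the sketch below assumes a Gymnasium-style interface and a hypothetical `safety_filter` callable:
```python
# Sketch of one way to plug a safety filter into an RL setup: wrap the
# environment so every action is filtered before it is executed.
# `safety_filter` is a hypothetical callable (obs, action) -> safe_action.
import gymnasium as gym


class SafetyFilterWrapper(gym.Wrapper):
    def __init__(self, env, safety_filter):
        super().__init__(env)
        self.safety_filter = safety_filter
        self._last_obs = None

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self._last_obs = obs
        return obs, info

    def step(self, action):
        # Project the proposed action onto the safe set before stepping.
        safe_action = self.safety_filter(self._last_obs, action)
        obs, reward, terminated, truncated, info = self.env.step(safe_action)
        self._last_obs = obs
        return obs, reward, terminated, truncated, info


# Usage: env = SafetyFilterWrapper(gym.make("CartPole-v1"), my_filter)
```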
-
See [log](https://dart-ci.appspot.com/log/vm-kernel-linux-debug-x64/dartk-weak-asserts-linux-debug-x64/21449/co19/LanguageFeatures/nnbd/const_evaluation_A10_t01):
```
Unhandled exception:
Expect.id…
-
### Contact Details
vicente.herrera@control-plane.io
### What is the idea
During a working session we started talking about responsible AI, something we look forward to getting into…
-
llama-stack install from source: https://github.com/meta-llama/llama-stack/tree/cherrypick-working
### System Info
python -m "torch.utils.collect_env"
/home/kaiwu/miniconda3/envs/llama/lib/pytho…
-
Model Performance Improvements
- [x] change the text-completion model from the paid gpt-4o to groq-llama3.1-70b-versatile; reason: fast inference, free but rate-limited, better for deployment (see the sketch after this list).
- [x] chang…
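A minimal sketch of what the swap could look like, assuming the app talks to the model through an OpenAI-compatible client (Groq exposes such an endpoint); the exact model identifier, prompt, and environment variable name are assumptions:
```python
# Sketch: point an OpenAI-compatible client at Groq instead of the paid
# gpt-4o deployment. Model identifier and env var name are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # Groq's OpenAI-compatible endpoint
    api_key=os.environ["GROQ_API_KEY"],
)

response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",  # replaces "gpt-4o"
    messages=[{"role": "user", "content": "Summarize the dashboard status."}],
)
print(response.choices[0].message.content)
```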
-
## Description
Users express the need for data schema evaluation to enable "fail-fast" capabilities during data loading and consistency checks before execution. They highlight the potential benefits …
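As a rough illustration of the requested fail-fast behaviour, a sketch that validates each incoming record against a declared schema before any loading work starts; the schema, field names, and choice of the `jsonschema` library are assumptions, not part of the request:
```python
# Sketch of fail-fast schema evaluation: validate every record against a
# declared schema before loading begins, so bad data aborts the run early.
# The schema, field names, and use of jsonschema are assumptions.
from jsonschema import validate, ValidationError

ORDER_SCHEMA = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "amount": {"type": "number", "minimum": 0},
        "currency": {"type": "string", "enum": ["USD", "EUR"]},
    },
    "required": ["order_id", "amount", "currency"],
}


def check_batch(records):
    """Raise on the first record that violates the schema (fail fast)."""
    for i, record in enumerate(records):
        try:
            validate(instance=record, schema=ORDER_SCHEMA)
        except ValidationError as err:
            raise ValueError(f"record {i} failed schema check: {err.message}") from err


# check_batch(rows)  # call before handing `rows` to the loader
```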