nerfZael closed 4 months ago
Finished benchmarks Download artifacts
./autotx/tests/agents/token
autotx/tests/agents/token/
Test Name | Success Rate (%) | Passes | Fails | Avg Time | Avg Cost |
---|---|---|---|---|---|
research/test_advanced.py::test_research_and_swap_many_tokens_subjective_complex | 67 (+57) | 2 | 1 | 4.83m | $1.72 |
research/test_advanced.py::test_research_and_swap_many_tokens_subjective_simple | 0 (-90) | 0 | 3 | 1.39m | $0.54 |
research/test_research.py::test_get_top_5_memecoins | 100 (+10) | 3 | 0 | 43s | $0.29 |
research/test_research.py::test_get_top_5_memecoins_in_optimism | 100 | 3 | 0 | 49s | $0.35 |
research/test_research.py::test_get_top_5_most_traded_tokens_from_l1 | 100 | 3 | 0 | 46s | $0.32 |
research/test_research.py::test_get_top_5_tokens_from_base | 67 (-33) | 2 | 1 | 1.16m | $0.19 |
research/test_research.py::test_price_change_information | 100 | 3 | 0 | 25s | $0.04 |
research/test_research_and_swap.py::test_research_and_buy_multiple | 0 (-100) | 0 | 3 | 51s | $0.38 |
research/test_research_and_swap.py::test_research_and_buy_one | 0 (-100) | 0 | 3 | 54s | $0.19 |
research/test_research_swap_and_send.py::test_research_buy_multiple_send_multiple | 0 (-100) | 0 | 3 | 1.01m | $0.28 |
research/test_research_swap_and_send.py::test_research_buy_one_send_multiple | 67 (-33) | 2 | 1 | 44s | $0.15 |
research/test_research_swap_and_send.py::test_research_buy_one_send_one | 0 (-100) | 0 | 3 | 49s | $0.16 |
send/test_send.py::test_send_erc20 | 100 | 3 | 0 | 19s | $0.00 |
send/test_send.py::test_send_erc20_parallel | 100 | 3 | 0 | 22s | $0.01 |
send/test_send.py::test_send_eth_multiple | 100 | 3 | 0 | 31s | $0.01 |
send/test_send.py::test_send_native | 100 | 3 | 0 | 15s | $0.00 |
send/test_send.py::test_send_native_sequential | 100 | 3 | 0 | 20s | $0.01 |
test_swap.py::test_swap_complex_1 | 100 | 3 | 0 | 35s | $0.03 |
test_swap.py::test_swap_complex_2 | 67 (-33) | 2 | 1 | 60s | $0.13 |
test_swap.py::test_swap_multiple_1 | 0 (-100) | 0 | 3 | 1.24m | $0.11 |
test_swap.py::test_swap_multiple_2 | 100 | 3 | 0 | 28s | $0.00 |
test_swap.py::test_swap_native | 100 | 3 | 0 | 23s | $0.00 |
test_swap.py::test_swap_triple | 100 | 3 | 0 | 48s | $0.00 |
test_swap.py::test_swap_with_non_default_token | 100 | 3 | 0 | 22s | $0.00 |
test_swap_and_send.py::test_send_and_swap_complex | 0 (-100) | 0 | 3 | 46s | $0.05 |
test_swap_and_send.py::test_send_and_swap_simple | 33 (-67) | 1 | 2 | 29s | $0.02 |
test_swap_and_send.py::test_swap_and_send_complex | 67 (-33) | 2 | 1 | 42s | $0.02 |
test_swap_and_send.py::test_swap_and_send_simple | 100 | 3 | 0 | 26s | $0.01 |
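
For anyone cross-checking the Success Rate column: the values are consistent with passes divided by total runs, rounded to a whole percent. The sketch below is only an illustration of that arithmetic, not the benchmark workflow's actual code; the function name `success_rate` is hypothetical.

```python
def success_rate(passes: int, fails: int) -> int:
    """Return the pass percentage rounded to the nearest whole number.

    Hypothetical helper: assumes the rate is passes / (passes + fails),
    which matches every row of the table above.
    """
    total = passes + fails
    if total == 0:
        raise ValueError("at least one run is required")
    return round(100 * passes / total)


# Pass/fail pairs taken from rows of the table above.
print(success_rate(2, 1))  # test_swap_complex_2
print(success_rate(1, 2))  # test_send_and_swap_simple
print(success_rate(3, 0))  # test_swap_native
```

Every row has passes and fails summing to 3, which presumably corresponds to the trailing `3` in the benchmark command, so the only rates that can appear are 0, 33, 67 and 100.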
Total run time: 70.16 minutes
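
As a rough sanity check, the reported total is close to the sum of the Avg Time column multiplied by 3 runs per test. The sketch below assumes the tests ran sequentially and that each was run 3 times; both are assumptions, and `to_seconds` and `estimated_total_minutes` are hypothetical helpers, not part of the workflow.

```python
import re

# Avg Time values copied from the table, in row order.
AVG_TIMES = [
    "4.83m", "1.39m", "43s", "49s", "46s", "1.16m", "25s",
    "51s", "54s", "1.01m", "44s", "49s",
    "19s", "22s", "31s", "15s", "20s",
    "35s", "60s", "1.24m", "28s", "23s", "48s", "22s",
    "46s", "29s", "42s", "26s",
]


def to_seconds(text: str) -> float:
    """Parse a duration such as '43s' or '1.16m' into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([ms])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value * 60 if unit == "m" else value


def estimated_total_minutes(avg_times: list[str], runs: int) -> float:
    """Estimate wall time as runs * sum(average time), in minutes."""
    return runs * sum(to_seconds(t) for t in avg_times) / 60


print(f"{estimated_total_minutes(AVG_TIMES, runs=3):.2f} minutes")
```

The estimate comes out at roughly 70.2 minutes; the small gap from the reported 70.16 is what rounding of the per-test averages would be expected to produce.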
/workflows/benchmarks agents/token 3