Open WagnerJon opened 3 months ago
When the --run-all flag is set, the content of the result files should be deleted (done here).
Each result file should then contain only the header, with no data rows:
model_name,subtask,score,iterations,md5_hash,datetime
Can you verify that the content of the files is deleted?
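One way to check this is a small script that reads each result file and confirms that it holds the header and nothing else. The following is only a sketch: the header is the one quoted above, but the helper names and the assumption that the result files are `*.csv` files in a single directory are hypothetical and may not match the actual benchmark layout.

```python
import csv
from pathlib import Path

# Header as quoted in this issue.
EXPECTED_HEADER = [
    "model_name", "subtask", "score", "iterations", "md5_hash", "datetime",
]


def is_header_only(path):
    """Return True if the CSV at `path` has the expected header and no data rows."""
    with Path(path).open(newline="") as handle:
        # Drop completely blank lines so a trailing newline does not count as data.
        rows = [row for row in csv.reader(handle) if row]
    return rows == [EXPECTED_HEADER]


def find_uncleared_results(result_dir):
    """Return the result files that still contain data rows.

    Assumes (hypothetically) that results are stored as *.csv files
    directly inside `result_dir`.
    """
    return sorted(
        path for path in Path(result_dir).glob("*.csv")
        if not is_header_only(path)
    )
```

Calling `find_uncleared_results` on the results directory right after the deletion step, for example from a debugger breakpoint, returns an empty list if every file was reset correctly.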
Could you please debug the tests instead of running them with --run-all? If this is just about making sure that the benchmark runs, running one test should be enough. If you want to figure out why a test is failing or skipped, debugging will equally allow you to check. Same goes for file deletion etc.
Just to clarify: the expected behaviour is not that all tests are run, because some contain skip conditions (you can see them easily in the test code). It does not make sense to run "implicit" RAG evaluation with an "explicit" prompt, for instance: https://github.com/biocypher/biochatter/blob/1d27e3214fc96cef0833422bbb77b627970aaa45/benchmark/test_rag_interpretation.py#L73
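To illustrate what such a skip condition looks like, here is a minimal sketch. It is not the code from the linked test file: the function name, the parameter names, and the rule that prompt and evaluation kinds must match are assumptions made for illustration only.

```python
def skip_reason(prompt_kind, evaluation_kind):
    """Return a reason string if the combination should be skipped, else None.

    Hypothetical rule: a prompt is only evaluated against the RAG
    evaluation of the same kind ("explicit" or "implicit").
    """
    if prompt_kind != evaluation_kind:
        return (
            f"{prompt_kind!r} prompt is not meaningful for "
            f"{evaluation_kind!r} RAG evaluation"
        )
    return None


# Inside a pytest test, the helper would typically be used like this:
#
#     reason = skip_reason(prompt_kind, evaluation_kind)
#     if reason:
#         pytest.skip(reason)
```

A condition of this kind is independent of --run-all, so the affected tests are reported as skipped on every run. Adding pytest's `-rs` option prints each skipped test together with its reason in the summary.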
Some benchmarks are being skipped even with inclusion of "--run-all" flag
To Reproduce
pytest benchmark --run-all

Stack trace
collected 106 items

benchmark/test_vectorstore_semantic_search.py ss [ 1%]
benchmark/test_biocypher_query_generation.py ..s......s................................................ [ 56%]
................... [ 74%]
benchmark/test_rag_interpretation.py .s.s.ss.s..s.s.s [ 89%]
benchmark/test_biocypher_query_generation.py sssssssssss [100%]

================================= 83 passed, 23 skipped in 212.76s (0:03:32) ==================================

Expected behavior
All benchmarks should be run.

Desktop
platform linux -- Python 3.10.12, pytest-8.0.2, pluggy-1.4.0