Closed: KevinH48264 closed this issue 8 months ago.
One workaround was: 1) changing the code for this, because the absolute path pointed to MLAgentBench//env_log/../log when it should start from the root directory rather than MLAgentBench, and 2) copying main_log into the root directory as "log" -- no log file is being generated (I only have a main_log file), so the eval code throws an error. I don't think this is the best workaround, so I'd be interested in hearing thoughts on what might be happening and whether there's a better fix.
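For reference, workaround (2) is roughly the copy step sketched below. This is just my setup, not anything from the repo: the folder name, the assumed location of main_log, and the assumption that the eval code looks for a file named "log" one level above env_log all come from my run.

```python
import shutil
from pathlib import Path

# Illustrative paths from my run; adjust to wherever main_log actually ended up.
log_dir = Path("house-price-testing-v1-gpt4")   # folder passed via --log-dir
main_log = log_dir / "env_log" / "main_log"     # assumed location of the generated main_log

# Workaround (2): copy main_log to the spot the eval code appears to expect,
# i.e. env_log/../log (a file named "log" next to the env_log folder).
shutil.copy(main_log, log_dir / "log")
```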
The question of where 'answer.csv' lives still persists, though, and prevents an actual evaluation.
We generate the MLAgentBench//env_log/../log file to capture all the output of the script, as shown here: https://github.com/snap-stanford/MLAgentBench/blob/main/multi_run_experiment.sh#L44
I've made a workaround to get eval() working.
It seems like the code expects answer.csv to be either in the root directory (which doesn't make sense, because it's never copied there) or under the step folder (which would be a reasonable place for it). However, in my testing answer.csv is downloaded into the /scripts folder instead of the /env folder, so it will never be copied into the workspace or the step folders.
My current workaround was to add an argument to the get_score() function for the answer file path, pointing it at f"MLAgentBench/benchmarks/{benchmark_foldername}/scripts/answer.csv", but I'm curious whether this is an expected issue or if I'm missing something.
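For concreteness, the change looks roughly like the sketch below. The signature, the submission_path parameter, and the placeholder metric are my own illustration (the real get_score()/eval.py code and column names may differ); only the default answer-file path reflects what I described above.

```python
import numpy as np
import pandas as pd

def get_score(benchmark_foldername, submission_path, answer_path=None):
    """Workaround sketch: accept an explicit answer_path instead of relying on
    pd.read_csv("answer.csv") resolving against the current working directory."""
    if answer_path is None:
        # Default to where answer.csv actually gets downloaded in my runs:
        # the benchmark's scripts/ folder.
        answer_path = f"MLAgentBench/benchmarks/{benchmark_foldername}/scripts/answer.csv"
    answer = pd.read_csv(answer_path)
    submission = pd.read_csv(submission_path)
    # Placeholder scoring purely for illustration ("Id"/"SalePrice" columns assumed);
    # the benchmark's real metric code would stay unchanged.
    merged = submission.merge(answer, on="Id", suffixes=("_pred", "_true"))
    return float(np.sqrt(((merged["SalePrice_pred"] - merged["SalePrice_true"]) ** 2).mean()))
```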
Our script expects answer.csv to be generated in the same folder as ../scripts/eval.py, and it's supposed to work out of the box. Will check on this on my end.
This could be a Docker problem, but it seems unlikely. When I go to eval.py in benchmarks/house-price/scripts, os.path.abspath("answer.csv") prints as MLAgentBench/answer.csv. That file doesn't exist, and I assume the intended path was really MLAgentBench/benchmarks/house-price/scripts/answer.csv.
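That matches how os.path.abspath behaves: it resolves a relative name against the process's current working directory, not against the script that contains the call, so the result depends entirely on where the command was launched from. A quick check:

```python
import os

# abspath("answer.csv") is just cwd joined with "answer.csv"; it knows nothing
# about which script the call lives in.
print(os.getcwd())                    # e.g. .../MLAgentBench when launched from the repo root
print(os.path.abspath("answer.csv"))  # -> .../MLAgentBench/answer.csv
```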
Maybe Docker sets the working directory to MLAgentBench, but I'm still not sure how the line test_data = pd.read_csv("answer.csv") in eval.py could possibly resolve to the right path (MLAgentBench/benchmarks/house-price/scripts/answer.csv) unless the user ran the eval command from MLAgentBench/benchmarks/house-price/scripts on the command line.
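One way to make that read robust to the caller's working directory (just a sketch, not the repo's actual code) is to anchor the path to eval.py itself:

```python
import os
import pandas as pd

# Resolve answer.csv relative to eval.py's own directory, so the eval command
# works no matter which directory it is launched from.
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
test_data = pd.read_csv(os.path.join(SCRIPT_DIR, "answer.csv"))
```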
Yeah, I think we were calling it directly from the command line. Could you push your workaround? Thanks!
I'm trying to evaluate the output of the train.py created by my agent. The --log-dir flag I passed in the original command was "house-price-testing-v1-gpt4", which I assume should match the "--log-folder" flag.
I'm actually not sure what to expect from the eval command presented in the README. I assume that a score (% accuracy?) will be found in a results.json file (which will be in my root directory?).
But before I can even get a results.json file produced, I'm running into this error with the eval command. I was wondering if you'd be able to help provide more information about what to expect from the eval.py script and how to address this error?