Hey @cfcv, thanks for the report. I believe the cause is a small and very silly bug we found recently where changing the experiment name breaks metrics aggregation. The fix will be published soon -- in the meantime, could you re-run it with the default `experiment_name` and `group` and let me know if you have further issues?
Hi Michael, thanks for the answer. However, changing the experiment name and group to the same values as in training did not solve the problem, and the evaluation is still not able to generate the aggregator metric file.
I am attaching my train and evaluation Python scripts to this message so you can have a look and see if I am doing something wrong. (I changed the extension to .txt to be able to attach them.)
In addition, could you point out the code file or function that is supposed to create the aggregator metric file? With this info I could try to debug on my side and understand what is happening.
Hey @cfcv, sorry I should've been more clear -- could you not pass any experiment name at all? After I commented it out, metric aggregation worked:
```python
cfg = hydra.compose(config_name=CONFIG_NAME, overrides=[
    # f'experiment_name={EXPERIMENT}',
    # f'group={SAVE_DIR}',
    'planner=ml_planner',
    'model=raster_model',
    'planner.ml_planner.model_config=${model}',  # hydra notation to select model config
    f'planner.ml_planner.checkpoint_path={MODEL_PATH}',  # this path can be replaced by the checkpoint of the model trained in the previous section
    f'+simulation={CHALLENGE}',
    *DATASET_PARAMS,
])
```
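As a quick sanity check after the run (a minimal sketch using only `pathlib`; the output path is a placeholder for wherever your simulation results end up), you can look for the aggregated parquet file directly:

```python
from pathlib import Path

# Placeholder: point this at the folder where the simulation wrote its results.
output_root = Path("/path/to/exp/output")

# The aggregated metrics are written as a parquet file under aggregator_metric;
# an empty list here means the aggregator never produced one.
aggregated = list(output_root.rglob("aggregator_metric/*.parquet"))
print(aggregated)
```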
Metrics aggregation happens in a callback here -- an alternate temporary fix is replacing

```python
metrics = self._metric_save_path.rglob("*.parquet")
if metric_aggregator.challenge is None:
    challenge_metrics = list(metrics)
else:
    challenge_metrics = [path for path in metrics if metric_aggregator.challenge in str(path)]
```

with

```python
challenge_metrics = list(self._metric_save_path.rglob("*.parquet"))
```

You do have to be careful when running multiple challenges, so I'd recommend just not using an experiment name for the time being if that's possible. I'll ping you on this issue once it's resolved.
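To make the multiple-challenges caveat concrete, here is a small standalone sketch of the filtering logic quoted above (plain `pathlib`, no nuPlan imports; the path and challenge name are placeholders). The original code drops every parquet file whose path does not contain the challenge name, which is presumably how the list ends up empty in this issue, while the unfiltered `rglob` would mix parquet files from every challenge found under the same metrics folder:

```python
from pathlib import Path

# Placeholders -- adjust to your own experiment layout.
metric_save_path = Path("/path/to/exp/metrics")
challenge = "open_loop_boxes"

# Original behaviour: keep only metric files whose path mentions the challenge.
# If the folder layout does not contain the challenge name, this comes back
# empty and no aggregated file is written.
filtered = [p for p in metric_save_path.rglob("*.parquet") if challenge in str(p)]

# Temporary workaround from above: take every parquet file. Fine for a single
# challenge, but with several challenges under the same path their metric
# files would all be aggregated together.
everything = list(metric_save_path.rglob("*.parquet"))

print(len(filtered), len(everything))
```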
The following work-around worked for me:
Within the experiment folder, there is a directory called `aggregator_metric`. In there, create a subdirectory for the corresponding challenge, e.g. `closed_loop_reactive_agents`.
Copy all parquet files from the `metric` directory into the newly created challenge directory. Once you have done that, you can use the `run_metric_aggregator` script to generate the aggregated metrics. The final file should then be located in the `aggregator_metric` folder.
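A minimal sketch of these steps in Python (the experiment path and challenge name are placeholders to replace with your own):

```python
import shutil
from pathlib import Path

# Placeholders -- point these at your own experiment folder and challenge.
experiment_dir = Path("/path/to/your/experiment")
challenge = "closed_loop_reactive_agents"

# Create the challenge subdirectory inside aggregator_metric.
challenge_dir = experiment_dir / "aggregator_metric" / challenge
challenge_dir.mkdir(parents=True, exist_ok=True)

# Copy every parquet file from the metric directory into it, then run the
# run_metric_aggregator script; the aggregated file should land in aggregator_metric.
for parquet_file in (experiment_dir / "metric").rglob("*.parquet"):
    shutil.copy(parquet_file, challenge_dir / parquet_file.name)
```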
As far as I have figured out, the problem is that the metric aggregator is searching for the metric files in the wrong place. Maybe this can even be solved by just adapting some paths in the configs.
Let me know if this works for you.
Thanks for the clarification @michael-motional, it worked!
@mh0797 thanks for the additional solution, I'll try it out too
Great! I'll close the issue, but let you know here once the proper fix is finalized :)
Description
Hi, I am trying to evaluate, in open loop, the Raster model that I trained following the notebook tutorial. However, the evaluation script is not generating the aggregator metric file, even though the metric aggregator is set to `open_loop_boxes_weighted_average`. As a consequence, I am not able to visualize the results on nuBoard (overview window).
I am running the following Python code:
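(Sketch of the compose call, not the exact attached script -- it mirrors the snippet quoted earlier in this thread, with the `experiment_name` and `group` overrides active, which is what appears to break the aggregation; all names and paths are placeholders.)

```python
# Sketch only (not the original attachment): the same compose call as in the
# tutorial, but with experiment_name and group overridden. Values are placeholders.
cfg = hydra.compose(config_name=CONFIG_NAME, overrides=[
    f'experiment_name={EXPERIMENT}',   # custom experiment name
    f'group={SAVE_DIR}',               # custom output directory group
    'planner=ml_planner',
    'model=raster_model',
    'planner.ml_planner.model_config=${model}',
    f'planner.ml_planner.checkpoint_path={MODEL_PATH}',
    '+simulation=open_loop_boxes',     # open-loop challenge
    *DATASET_PARAMS,
])
```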
Stack Trace
We can see that a warning is logged because the metric files were not found.
Additional information:
Any clues as to what could be happening? Thanks in advance :)
log.txt