Open Erfi opened 1 year ago
This issue is related to #24
What will the trajectories look like?
Trajectories will be a list of type Trajectory, e.g. rollouts = [traj1, traj2], where traj1.obs is an ndarray of shape (n, 30) and traj1.acts is an ndarray of shape (n-1, 3),
with n being the length of the episode.
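A minimal sketch of the container described above (a hypothetical stand-in for illustration; the real Trajectory class used in the repo may come from a library and carry more fields):

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Trajectory:
    """Hypothetical container matching the layout described above."""
    obs: np.ndarray   # shape (n, 30): one observation per step, including the final one
    acts: np.ndarray  # shape (n-1, 3): one action per transition


n = 100  # episode length (example value)
traj1 = Trajectory(obs=np.zeros((n, 30)), acts=np.zeros((n - 1, 3)))
rollouts = [traj1]
```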
I can store the episodes for each trained model in a folder, and we can load them and run the KPIs on them. @shamilmamedov, do you think you could change the KPIs (especially the constraint-violation function) to take a list of trajectories as input? I don't think we need access to the model (for getting the ee_pos from obs, since ee_pos is now part of the obs vector), referring to this line.
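As a rough sketch of the requested signature change, the constraint-violation KPI could iterate over a list of trajectories and read ee_pos directly from obs. The slice indices and the wall/ground limits below are assumptions for illustration, not the repo's actual constants:

```python
import numpy as np

# Assumed layout: ee_pos occupies columns 0:3 of each observation,
# and the constraints are a wall at y = Y_WALL and the ground at z = 0.
EE_SLICE = slice(0, 3)  # hypothetical indices of ee_pos within obs
Y_WALL = 0.15           # hypothetical wall position


def constraint_violation(trajectories) -> float:
    """Fraction of timesteps (over all trajectories) at which ee_pos
    violates the wall or ground constraint."""
    violations, total = 0, 0
    for traj in trajectories:
        ee_pos = traj.obs[:, EE_SLICE]       # (n, 3) end-effector positions
        hit_wall = ee_pos[:, 1] > Y_WALL     # y beyond the wall
        hit_ground = ee_pos[:, 2] < 0.0      # z below the ground
        violations += int(np.sum(hit_wall | hit_ground))
        total += ee_pos.shape[0]
    return violations / max(total, 1)
```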
Sure, can you save some trajectories and let me know where they are stored? I need them to test the code after modifying it
I will push the code and the collected demos shortly. (The models are not fully trained yet, so these are the demos collected so far, to help with the KPI pipeline.)
OK, the demos folder
now has preliminary demos from each algorithm (5 trajectories per algorithm). test_kpi.py
can load the demos using: python -m tests.test_kpi collect_demos=False
At the bottom of test_kpi.py we can add the functions from kpi.py and plot things.
@shamilmamedov If you could tell me what the results of the KPI functions look like, I can do the plotting part.
Thank you, I will work on the KPI pipeline later today. As soon as I decide on the output, I will let you know.
It seems I won't be able to compute the constraint violation by the elbow, because we don't have the elbow position as an observation. Ideally, we should include the elbow position in the vector of observations. The only way to check it is to use the "model", not the "real" robot.
I guess modifying the vector of observations is going to affect many things down the line. Any ideas on how to proceed?
We do have access to the current q (joint angles) in every observation. If by "model" you mean the SymbolicFlexibleArm3DOF
that is used (either by the env (10 seg) or by the Estimator/MPC (3 seg)), I can get that from the env and pass it as an input to the KPI functions. Would that work?
If you can pass the state of the simulator SymbolicFlexibleArm3DOF
with 10 segments, then I can accurately compute the elbow position and check whether it collides with the wall or the ground.
Hmm... since the data is offline, I won't be able to do that. But what if we had a new instantiation of SymbolicFlexibleArm3DOF
with 10 segments? Given that we have q
and dq
in every observation, can we set the state of (or initialize) the SymbolicFlexibleArm3DOF
using q and dq?
Although q and dq come from the Estimator model (3 seg), so I don't think that works...
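To make the idea concrete: with q available in every observation, the elbow position can be recovered through forward kinematics of some model. Since the SymbolicFlexibleArm3DOF API isn't shown in this thread, the sketch below uses a toy planar two-link FK as a stand-in, with assumed link lengths and an assumed location of q inside obs:

```python
import numpy as np

# Hypothetical stand-ins; the real code would query SymbolicFlexibleArm3DOF.
L1, L2 = 0.3, 0.3  # assumed link lengths up to the elbow


def elbow_position(q: np.ndarray) -> np.ndarray:
    """Toy planar FK for the first two joints; illustration only."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    z = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, 0.0, z])


def elbow_violations(trajectories, q_slice=slice(0, 3), z_ground=0.0) -> int:
    """Count timesteps where the (approximate) elbow dips below the ground."""
    count = 0
    for traj in trajectories:
        for q in traj.obs[:, q_slice]:  # assumed: q stored in these obs columns
            if elbow_position(q)[2] < z_ground:
                count += 1
    return count
```

As discussed above, feeding the 3-seg q/dq into a 10-seg model would only approximate the "real" elbow position, so any such KPI should be reported as an estimate.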
Using q and dq of the "model" with three segments is of course possible. It won't be as accurate as using the states of the "real" system.
I don't have any great solutions for it at the moment, but it would be nice if we could at least use the 3-seg model (instead of the "real" one).
Description: There is a
kpi.py
file that computes several KPIs. Constraint violation (the last one) is of particular importance to us, since it will reflect the effectiveness of the safety filter.
Implementation: These functions (or at least the last one) need to be updated so that they work with the trajectories returned by the evaluation method.
The evaluation method should also be modified with a "return_trajectories" argument so it can return the trajectory of the robot during the evaluation procedure (currently it only returns the reward and episode length).
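One possible shape for the modified evaluation method, sketched against a generic gym-style env; the function name, the policy interface, and the tuple-based trajectory format are assumptions, not the repo's actual code:

```python
import numpy as np


def evaluate(policy, env, n_episodes=5, return_trajectories=False):
    """Run the policy for n_episodes; optionally return per-episode trajectories."""
    rewards, lengths, trajectories = [], [], []
    for _ in range(n_episodes):
        obs_list, act_list = [], []
        obs, done, ep_reward = env.reset(), False, 0.0
        obs_list.append(obs)
        while not done:
            action = policy(obs)
            obs, reward, done, info = env.step(action)
            obs_list.append(obs)
            act_list.append(action)
            ep_reward += reward
        rewards.append(ep_reward)
        lengths.append(len(act_list))
        # Stored as (obs, acts) arrays; shapes (n, obs_dim) and (n-1, act_dim).
        trajectories.append((np.array(obs_list), np.array(act_list)))
    if return_trajectories:
        return rewards, lengths, trajectories
    return rewards, lengths
```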
There also needs to be a script that loads every algorithm, evaluates it, runs the KPI measurements, adds the results to a file (JSON or YAML), and creates a plot from the KPI measurements to compare the constraint violations of the algorithms.
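The reporting half of that script could look roughly like this. The algorithm names, KPI keys, and output path are placeholders; in practice the per-algorithm results would come from running the KPI functions on each algorithm's trajectories:

```python
import json


def summarize(kpis_by_algo, path="kpi_results.json"):
    """Dump KPI results to a JSON file and return the algorithms
    ranked by constraint violation (lowest, i.e. safest, first)."""
    with open(path, "w") as f:
        json.dump(kpis_by_algo, f, indent=2)
    return sorted(kpis_by_algo.items(),
                  key=lambda kv: kv[1]["constraint_violation"])


# Placeholder numbers for illustration only.
results = {
    "ppo": {"constraint_violation": 0.12},
    "sac": {"constraint_violation": 0.05},
}
ranking = summarize(results)
# A comparison bar chart could then be drawn with matplotlib, e.g.:
# plt.bar([name for name, _ in ranking],
#         [kpi["constraint_violation"] for _, kpi in ranking])
```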