lardemua / atom

Calibration tools for multi-sensor, multi-modal robotic systems
GNU General Public License v3.0

Prepare softbot batch of experiments #926

Closed brunofavs closed 1 month ago

brunofavs commented 2 months ago

I'm writing this issue to document the batch of experiments I'm planning to run soon on softbot.

The goal is to use a big dataset and test the effect of different levels of noise and number of collections in the odometry transformation.

brunofavs commented 2 months ago

Bagfile recorded :

Note

The bagfile size/duration is low because I paused between positions for each taken collection. I used Gazebo's move tool to move the robot around roughly in this area :

image

I made sure to have positions evenly scattered in that area. This was a previous mistake of mine in the softbot atom example.

path:        /home/bruno/bagfiles/softbot/long_bag/long_bag1.bag
version:     2.0
duration:    1:44s (104s)
start:       Jan 01 1970 01:34:24.91 (2064.91)
end:         Jan 01 1970 01:36:09.04 (2169.04)
size:        488.7 MB
messages:    14941
compression: none [545/545 chunks]
types:       sensor_msgs/CameraInfo      [c9a58c1b0b154e0e6da7578cb991d214]
             sensor_msgs/CompressedImage [8f7a12909da2c9d3332d540a0977563f]
             sensor_msgs/JointState      [3066dcd76a6cfaef579bd0f34173e9fd]
             sensor_msgs/PointCloud2     [1158d486dd51d683ce2f1be655c3c181]
             tf2_msgs/TFMessage          [94810edda583a504dfda3829e70d7eec]
topics:      /front_left_camera/rgb/camera_info             1133 msgs    : sensor_msgs/CameraInfo     
             /front_left_camera/rgb/image_raw/compressed    1134 msgs    : sensor_msgs/CompressedImage
             /front_right_camera/rgb/camera_info            1132 msgs    : sensor_msgs/CameraInfo     
             /front_right_camera/rgb/image_raw/compressed   1133 msgs    : sensor_msgs/CompressedImage
             /joint_states                                  3122 msgs    : sensor_msgs/JointState     
             /lidar3d/points                                1042 msgs    : sensor_msgs/PointCloud2    
             /tf                                            6244 msgs    : tf2_msgs/TFMessage          (2 connections)
             /tf_static                                        1 msg     : tf2_msgs/TFMessage

Dataset collected :

Loaded dataset containing 3 sensors and 44 collections.
Dataset contains 3 sensors: ['front_left_camera', 'front_right_camera', 'lidar3d']
Dataset contains 1 patterns: ['pattern_1']
Selected collection key is 000
Complete collections (41):['000', '001', '002', '003', '005', '006', '007', '008', '009', '010', '011', '012', '013', '014', '015', '016', '017', '018', '019', '020', '021', '022', '023', '024', '025', '026', '028', '029', '030', '031', '032', '033', '034', '035', '037', '038', '039', '040', '041', '042', '043']
Incomplete collections (3):['004', '027', '036']
True
Sensor front_left_camera has 0 complete detections of pattern pattern_1: []
Sensor front_left_camera has 44 partial detections of pattern pattern_1: ['000', '001', '002', '003', '004', '005', '006', '007', '008', '009', '010', '011', '012', '013', '014', '015', '016', '017', '018', '019', '020', '021', '022', '023', '024', '025', '026', '027', '028', '029', '030', '031', '032', '033', '034', '035', '036', '037', '038', '039', '040', '041', '042', '043']
True
Sensor front_right_camera has 0 complete detections of pattern pattern_1: []
Sensor front_right_camera has 44 partial detections of pattern pattern_1: ['000', '001', '002', '003', '004', '005', '006', '007', '008', '009', '010', '011', '012', '013', '014', '015', '016', '017', '018', '019', '020', '021', '022', '023', '024', '025', '026', '027', '028', '029', '030', '031', '032', '033', '034', '035', '036', '037', '038', '039', '040', '041', '042', '043']
Sensor lidar3d is not a camera. All detections are complete.

Analysis for pattern pattern_1

Collections
+------------+-------------+-------------------+--------------------+----------+
| Collection | is complete | front_left_camera | front_right_camera | lidar3d  |
+------------+-------------+-------------------+--------------------+----------+
|    000     |     yes     |      partial      |      partial       | detected |
|    001     |     yes     |      partial      |      partial       | detected |
|    002     |     yes     |      partial      |      partial       | detected |
|    003     |     yes     |      partial      |      partial       | detected |
|    004     |      no     |      partial      |    not detected    | detected |
|    005     |     yes     |      partial      |      partial       | detected |
|    006     |     yes     |      partial      |      partial       | detected |
|    007     |     yes     |      partial      |      partial       | detected |
|    008     |     yes     |      partial      |      partial       | detected |
|    009     |     yes     |      partial      |      partial       | detected |
|    010     |     yes     |      partial      |      partial       | detected |
|    011     |     yes     |      partial      |      partial       | detected |
|    012     |     yes     |      partial      |      partial       | detected |
|    013     |     yes     |      partial      |      partial       | detected |
|    014     |     yes     |      partial      |      partial       | detected |
|    015     |     yes     |      partial      |      partial       | detected |
|    016     |     yes     |      partial      |      partial       | detected |
|    017     |     yes     |      partial      |      partial       | detected |
|    018     |     yes     |      partial      |      partial       | detected |
|    019     |     yes     |      partial      |      partial       | detected |
|    020     |     yes     |      partial      |      partial       | detected |
|    021     |     yes     |      partial      |      partial       | detected |
|    022     |     yes     |      partial      |      partial       | detected |
|    023     |     yes     |      partial      |      partial       | detected |
|    024     |     yes     |      partial      |      partial       | detected |
|    025     |     yes     |      partial      |      partial       | detected |
|    026     |     yes     |      partial      |      partial       | detected |
|    027     |      no     |    not detected   |      partial       | detected |
|    028     |     yes     |      partial      |      partial       | detected |
|    029     |     yes     |      partial      |      partial       | detected |
|    030     |     yes     |      partial      |      partial       | detected |
|    031     |     yes     |      partial      |      partial       | detected |
|    032     |     yes     |      partial      |      partial       | detected |
|    033     |     yes     |      partial      |      partial       | detected |
|    034     |     yes     |      partial      |      partial       | detected |
|    035     |     yes     |      partial      |      partial       | detected |
|    036     |      no     |    not detected   |      partial       | detected |
|    037     |     yes     |      partial      |      partial       | detected |
|    038     |     yes     |      partial      |      partial       | detected |
|    039     |     yes     |      partial      |      partial       | detected |
|    040     |     yes     |      partial      |      partial       | detected |
|    041     |     yes     |      partial      |      partial       | detected |
|    042     |     yes     |      partial      |      partial       | detected |
|    043     |     yes     |      partial      |      partial       | detected |
+------------+-------------+-------------------+--------------------+----------+
I'm going to correct the labeling now.
miguelriemoliveira commented 2 months ago

Great issue.

brunofavs commented 2 months ago

Ignore this comment, it was a simple oversight

Just leaving it here in case I fall into the same mistake somewhere down the line again

After labeling half of the dataset, I tried to test with a collection selection function to filter only the labeled collections, and the results weren't great.

But the transformation from world to baselink wasn't listed in the optimization parameters, which I thought was already sorted out :

image (I inserted a screenshot because for some reason the newlines weren't copying over and the text was a mess)

brunofavs commented 2 months ago

Well, it was just an oopsie. I forgot to add the odom tf to the additional tfs tab in the config.yml.

Now I'm getting a different problem : ATOM Error: Config file has duplicate transform(s) ['lidar3d_plate_link-lidar3d_base_link']. Invalid configuration.

I'm going to investigate
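For context, the check behind this error is presumably just duplicate detection over the transform keys assembled from the config — a guess at the mechanism for illustration, not the actual ATOM code:

```python
from collections import Counter

def find_duplicate_transforms(transform_keys):
    """Return the transform keys that appear more than once in a config."""
    counts = Counter(transform_keys)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical key list: the lidar transform ends up listed twice, e.g. once
# from the sensor definition and once from the additional tfs section.
keys = ["world-base_link",
        "lidar3d_plate_link-lidar3d_base_link",
        "lidar3d_plate_link-lidar3d_base_link"]
print(find_duplicate_transforms(keys))  # ['lidar3d_plate_link-lidar3d_base_link']
```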

brunofavs commented 2 months ago

Hey @miguelriemoliveira, got some great news: with -nig 0.1m 0.1rad and odom noise 0.05m 0rad, softbot was calibrated with some degree of success :

The reprojection errors started at over 200 and finished like this :

+------------+------------------------+-------------------------+-------------+
| Collection | front_left_camera [px] | front_right_camera [px] | lidar3d [m] |
+------------+------------------------+-------------------------+-------------+
|    000     |         2.3912         |          1.9962         |    0.0064   |
|    001     |         0.9426         |          0.8107         |    0.0062   |
|    002     |         6.5780         |          6.0070         |    0.0062   |
|    003     |         1.6523         |          0.9756         |    0.0061   |
|    005     |         2.9630         |          2.1225         |    0.0058   |
|    006     |         8.7323         |          8.2241         |    0.0061   |
|    007     |         3.6090         |          3.5250         |    0.0063   |
|    008     |         1.0812         |          0.8477         |    0.0063   |
|    009     |         2.2908         |          1.8974         |    0.0059   |
|    010     |         3.4449         |          3.4796         |    0.0060   |
|    011     |         1.1233         |          0.9750         |    0.0058   |
|    012     |         1.6724         |          1.6908         |    0.0058   |
|    013     |         2.3859         |          1.7050         |    0.0056   |
|    014     |         1.4609         |          1.3646         |    0.0055   |
|    015     |         3.6605         |          3.6958         |    0.0057   |
|    016     |         1.7984         |          1.8748         |    0.0056   |
|    017     |         4.3426         |          4.2594         |    0.0060   |
|  Averages  |         2.9488         |          2.6736         |    0.0060   |
+------------+------------------------+-------------------------+-------------+

The softbot poses in rviz also started out randomly scattered, some even below the floor, and they ended like this : image

image

I discussed it with @manuelgitgomes and @Kazadhum and they gave me some useful suggestions :

It should be valid to assume the robot moves in a plane, hence it should equally be valid to add noise only to x, y and yaw.

Maybe we should meet to talk a bit about this and the next course of action @miguelriemoliveira. Are you available sometime this week?

miguelriemoliveira commented 2 months ago

Maybe we should meet to talk a bit about this and the next course of action @miguelriemoliveira. Are you available sometime this week?

Hi @brunofavs ,

looks good. Can we talk Thursday 9h?

One thing: you should use the -ctgt flag since you are in simulation. It will output the error w.r.t. the ground truth for all estimated params.

Add noise only to x,y,yaw

I think we would also need to define the coordinate system in which the error is added. It has to be a frame where z is the vertical ... In any case it sounds very specific to this case, so perhaps I would give it lower priority.
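To make the "noise only in x, y, yaw" idea concrete, here is a minimal sketch in plain Python (not ATOM code), assuming the pose is expressed in a frame whose z axis is vertical, so z, roll and pitch stay untouched:

```python
import math
import random

def add_planar_noise(x, y, yaw, trans_std, rot_std):
    """Perturb a planar pose: Gaussian noise on x/y translation and yaw only.

    Assumes the frame has z pointing up; the out-of-plane terms (z, roll,
    pitch) are left untouched, keeping the robot on the floor plane.
    """
    x_n = x + random.gauss(0.0, trans_std)
    y_n = y + random.gauss(0.0, trans_std)
    # wrap yaw back to [-pi, pi) after adding rotation noise
    yaw_n = (yaw + random.gauss(0.0, rot_std) + math.pi) % (2 * math.pi) - math.pi
    return x_n, y_n, yaw_n

# e.g. 0.05 m translation noise and no rotation noise, as in the run above
noisy = add_planar_noise(1.0, 2.0, 0.5, trans_std=0.05, rot_std=0.0)
```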

miguelriemoliveira commented 2 months ago

ATOM Error: Config file has duplicate transform(s) ['lidar3d_plate_link-lidar3d_base_link']. Invalid configuration.

Did you solve this?

brunofavs commented 2 months ago

Did you solve this?

Yes (20502bf), it was a small error in this line : https://github.com/lardemua/atom/blob/20502bfdbfb71603d30a6c4bd41082e4d187e074/atom_calibration/scripts/configure_calibration_pkg#L575

brunofavs commented 2 months ago

looks good. Can we talk Thursday 9h?

Sounds good to me :)

One thing: you should use the -ctgt flag since you are in simulation. It will output the error w.r.t. the ground truth for all estimated params.

That's a good idea, I will add that flag on my next run.

I think we would also need to define the coordinate system in which the error is added. It has to be a frame where z is the vertical ... In any case it sounds very specific to this case, so perhaps I would give it lower priority.

I'm not quite sure what you meant by this statement. Isn't it a valid assumption that the world frame always has z pointing upwards? I understand that if this isn't the case, there might be problems.

brunofavs commented 2 months ago

I got some weird behavior from -ctgt

image

miguelriemoliveira commented 2 months ago

I got some weird behavior from -ctgt

issue please ...

miguelriemoliveira commented 2 months ago

I'm not quite sure what you meant by this statement. Isn't it a valid assumption that the world frame always has z pointing upwards? I understand that if this isn't the case, there might be problems.

It's not a valid assumption, I think.

brunofavs commented 2 months ago

Update from the last few days :

I finished labeling and annotating the dataset. The annoying cumbersome part is over.

Here is the inspection of the dataset :

+------------+-------------+-------------------+--------------------+----------+
| Collection | is complete | front_left_camera | front_right_camera | lidar3d  |
+------------+-------------+-------------------+--------------------+----------+
|    000     |     yes     |      partial      |      partial       | detected |
|    001     |     yes     |      partial      |      partial       | detected |
|    002     |     yes     |      partial      |      partial       | detected |
|    003     |     yes     |      partial      |      partial       | detected |
|    004     |      no     |      partial      |    not detected    | detected |
|    005     |     yes     |      partial      |      partial       | detected |
|    006     |     yes     |      partial      |      partial       | detected |
|    007     |     yes     |      partial      |      partial       | detected |
|    008     |     yes     |      partial      |      partial       | detected |
|    009     |     yes     |      partial      |      partial       | detected |
|    010     |     yes     |      partial      |      partial       | detected |
|    011     |     yes     |      partial      |      partial       | detected |
|    012     |     yes     |      partial      |      partial       | detected |
|    013     |     yes     |      partial      |      partial       | detected |
|    014     |     yes     |      partial      |      partial       | detected |
|    015     |     yes     |      partial      |      partial       | detected |
|    016     |     yes     |      partial      |      partial       | detected |
|    017     |     yes     |      partial      |      partial       | detected |
|    018     |     yes     |      partial      |      partial       | detected |
|    019     |     yes     |      partial      |      partial       | detected |
|    020     |     yes     |      partial      |      partial       | detected |
|    021     |     yes     |      partial      |      partial       | detected |
|    022     |     yes     |      partial      |      partial       | detected |
|    023     |     yes     |      partial      |      partial       | detected |
|    024     |     yes     |      partial      |      partial       | detected |
|    025     |     yes     |      partial      |      partial       | detected |
|    026     |     yes     |      partial      |      partial       | detected |
|    027     |      no     |    not detected   |      partial       | detected |
|    028     |     yes     |      partial      |      partial       | detected |
|    029     |     yes     |      partial      |      partial       | detected |
|    030     |     yes     |      partial      |      partial       | detected |
|    031     |     yes     |      partial      |      partial       | detected |
|    032     |     yes     |      partial      |      partial       | detected |
|    033     |     yes     |      partial      |      partial       | detected |
|    034     |     yes     |      partial      |      partial       | detected |
|    035     |     yes     |      partial      |      partial       | detected |
|    036     |      no     |    not detected   |      partial       | detected |
|    037     |     yes     |      partial      |      partial       | detected |
|    038     |     yes     |      partial      |      partial       | detected |
|    039     |     yes     |      partial      |      partial       | detected |
|    040     |     yes     |      partial      |      partial       | detected |
|    041     |     yes     |      partial      |      partial       | detected |
|    042     |     yes     |      partial      |      partial       | detected |
|    043     |     yes     |      partial      |      partial       | detected |
+------------+-------------+-------------------+--------------------+----------+

Now I have been studying the batch experiments scripts and started to plan out the experiments.

As per @manuelgitgomes' suggestion, I'm going to use the stratified k-fold method for cross validation to give statistical strength to my experiments:

I'm going to divide my dataset into 3 folds, and I chose a 70% train size. These numbers were chosen somewhat arbitrarily: I used the usual 70-20-10 rule often used for ML models. For the number of folds I searched a few papers but couldn't find anything conclusive; I reckon it's too dataset-dependent.

I plan to run experiments with noise in the sensors (nig) and in the odom tf (ntfv) varying from 0/0 to 0.5/0.5 (m/rad). To accomplish this I'm executing batches of experiments where in each batch the nig is fixed and ntfv varies.
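Since the experiment list in data.yml below is a regular grid, it could also be generated programmatically. A small sketch in plain Python (the `nig_value` and `ntfv_value` field names follow what the template later references as `e.nig_value`/`e.ntfv_value`):

```python
def build_experiments():
    """Build the noise grid: a perfect run, nig-only runs, and every
    (nig, ntfv) pair with both values stepping from 0.1 to 0.5."""
    levels = [round(0.1 * i, 1) for i in range(1, 6)]  # 0.1 .. 0.5
    experiments = [{"name": "perfect_sim", "nig_value": 0.0, "ntfv_value": 0.0}]
    for nig in levels:  # varying nig only
        experiments.append({"name": f"nig_{nig}", "nig_value": nig, "ntfv_value": 0.0})
    for nig in levels:  # varying ntfv for each fixed nig
        for ntfv in levels:
            experiments.append({"name": f"nig_{nig}-ntfv_{ntfv}",
                                "nig_value": nig, "ntfv_value": ntfv})
    return experiments

experiments = build_experiments()  # 1 + 5 + 5*5 = 31 experiments
```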

This is my data.yml atm :

#
#           █████╗ ████████╗ ██████╗ ███╗   ███╗
#          ██╔══██╗╚══██╔══╝██╔═══██╗████╗ ████║
#          ███████║   ██║   ██║   ██║██╔████╔██║
#          ██╔══██║   ██║   ██║   ██║██║╚██╔╝██║
#   __     ██║  ██║   ██║   ╚██████╔╝██║ ╚═╝ ██║    _
#  / _|    ╚═╝  ╚═╝   ╚═╝    ╚═════╝ ╚═╝     ╚═╝   | |
#  | |_ _ __ __ _ _ __ ___   _____      _____  _ __| | __
#  |  _| '__/ _` | '_ ` _ \ / _ \ \ /\ / / _ \| '__| |/ /
#  | | | | | (_| | | | | | |  __/\ v  v / (_) | |  |   <
#  |_| |_|  \__,_|_| |_| |_|\___| \_/\_/ \___/|_|  |_|\_\
#  https://github.com/lardemua/atom

# this yml file contains variables to be used in conjunction with batch.yml

# Auxiliary variables, to be used to render other fields in the template.yml.j2 file
package_path: "package://softbot_calibration"
dataset_path: '$ATOM_DATASETS/softbot/long_train_dataset1'
collections_to_remove: [27,36,37]

# Runs are repetitions of the experiments for gathering statistically significant results
runs: [1,2,3,4,5,6,7,8,9,10]

cross_validation:
  type: "stratified-k-fold" 
  n_splits: 3 # Number of folds
  train_size: 0.7 # Percentage of the dataset used for training, only used in StratifiedShuffleSplit

# Experiments are executions with a set of input parameters
experiments:

# Varying noise from 0 to 0.5m/rad

  # Perfect Simulation
  - {name: perfect_sim, nig_value: 0.0, ntfv_value: 0.0}

  # Varying nig only
  - {name: nig_0.1, nig_value: 0.1, ntfv_value: 0.0}
  - {name: nig_0.2, nig_value: 0.2, ntfv_value: 0.0}
  - {name: nig_0.3, nig_value: 0.3, ntfv_value: 0.0}
  - {name: nig_0.4, nig_value: 0.4, ntfv_value: 0.0}
  - {name: nig_0.5, nig_value: 0.5, ntfv_value: 0.0}

  # Varying ntfv with fixed nig = 0.1
  - {name: nig_0.1-ntfv_0.1, nig_value: 0.1, ntfv_value: 0.1}
  - {name: nig_0.1-ntfv_0.2, nig_value: 0.1, ntfv_value: 0.2}
  - {name: nig_0.1-ntfv_0.3, nig_value: 0.1, ntfv_value: 0.3}
  - {name: nig_0.1-ntfv_0.4, nig_value: 0.1, ntfv_value: 0.4}
  - {name: nig_0.1-ntfv_0.5, nig_value: 0.1, ntfv_value: 0.5}

  # Varying ntfv with fixed nig = 0.2
  - {name: nig_0.2-ntfv_0.1, nig_value: 0.2, ntfv_value: 0.1}
  - {name: nig_0.2-ntfv_0.2, nig_value: 0.2, ntfv_value: 0.2}
  - {name: nig_0.2-ntfv_0.3, nig_value: 0.2, ntfv_value: 0.3}
  - {name: nig_0.2-ntfv_0.4, nig_value: 0.2, ntfv_value: 0.4}
  - {name: nig_0.2-ntfv_0.5, nig_value: 0.2, ntfv_value: 0.5}

  # Varying ntfv with fixed nig = 0.3
  - {name: nig_0.3-ntfv_0.1, nig_value: 0.3, ntfv_value: 0.1}
  - {name: nig_0.3-ntfv_0.2, nig_value: 0.3, ntfv_value: 0.2}
  - {name: nig_0.3-ntfv_0.3, nig_value: 0.3, ntfv_value: 0.3}
  - {name: nig_0.3-ntfv_0.4, nig_value: 0.3, ntfv_value: 0.4}
  - {name: nig_0.3-ntfv_0.5, nig_value: 0.3, ntfv_value: 0.5}

  # Varying ntfv with fixed nig = 0.4
  - {name: nig_0.4-ntfv_0.1, nig_value: 0.4, ntfv_value: 0.1}
  - {name: nig_0.4-ntfv_0.2, nig_value: 0.4, ntfv_value: 0.2}
  - {name: nig_0.4-ntfv_0.3, nig_value: 0.4, ntfv_value: 0.3}
  - {name: nig_0.4-ntfv_0.4, nig_value: 0.4, ntfv_value: 0.4}
  - {name: nig_0.4-ntfv_0.5, nig_value: 0.4, ntfv_value: 0.5}

  # Varying ntfv with fixed nig = 0.5
  - {name: nig_0.5-ntfv_0.1, nig_value: 0.5, ntfv_value: 0.1}
  - {name: nig_0.5-ntfv_0.2, nig_value: 0.5, ntfv_value: 0.2}
  - {name: nig_0.5-ntfv_0.3, nig_value: 0.5, ntfv_value: 0.3}
  - {name: nig_0.5-ntfv_0.4, nig_value: 0.5, ntfv_value: 0.4}
  - {name: nig_0.5-ntfv_0.5, nig_value: 0.5, ntfv_value: 0.5}

I'm now going to write the template.yml.j2 to plan out which evaluation arguments I'll be using.

brunofavs commented 2 months ago

Uuhh, I stumbled on what I think is a super newbie question this far down the line.

What exactly is the difference between atom_calibration.json and dataset_corrected.json? The first is a byproduct of the calibrate script; I supposed it was just the dataset with the updated calibrated transforms.

But then I saw this past issue #870, and there it seems you @miguelriemoliveira are using both and achieving the same outputs, so I'm confused.

This doubt came up because I'm not sure whether I should use one json or the other in the evaluations :

https://github.com/lardemua/atom/blob/92b9a432cf616516159a8cae7aac382a4d8ae874/atom_batch_execution/experiments/rrbot_example/template.yml.j2#L36

miguelriemoliveira commented 2 months ago

Hi @brunofavs ,

looks like we're moving forward. Congrats.

Some questions:

These numbers were chosen somewhat randomly.

You should think of some better reason. For now it's not important, but in the thesis you will need a much better explanation. Might as well come up with it now, before running the tests.

I used the usual 70-20-10 rule often used in ML models.

I do not understand what the 70-20-10 means in this context. This is usually train-validation-test, but in our experiments we only have train-test.

Runs are repetitions of the experiments for gathering statistically significant results runs: [1,2,3,4,5,6,7,8,9,10]

I thought the k-fold would replace these runs. @manuelgitgomes should we have this?

What do you use the validation for? Perhaps this has to do with @manuelgitgomes .

What exactly is the difference between atom_calibration.json and dataset_corrected.json ?

atom_calibration.json is the result of calibration using atom. dataset_corrected.json is the result of running dataset playback to correct some labels.

I would expect you to use the dataset_corrected.json as train dataset (input) for calibration, and receive as output an atom_calibration.json.

If #870 says otherwise then it's wrong.

manuelgitgomes commented 2 months ago

Hello. I think you are misusing the cross validation @brunofavs.

The train_size is only used for Stratified Shuffle Split. For k-fold, the number of folds dictates the validation size: 5 folds divides the dataset into 5, giving a 20% size for validation. I suggest dividing the dataset into 5 folds.

I agree with @miguelriemoliveira, we do not have a division of train-validation-test. We only need to divide the dataset into two: train and test/validation.
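To illustrate the point about fold counts: with plain k-fold, each test split is roughly 1/k of the data, so the train fraction is dictated by k rather than by train_size. A stdlib-only sketch over the 41 complete collections (a simplified, unshuffled split, not the sklearn implementation):

```python
def kfold_indices(n_items, n_splits):
    """Partition indices 0..n_items-1 into n_splits (train, test) pairs.

    Simplified, unshuffled k-fold: earlier folds absorb the remainder,
    so test folds differ in size by at most one element.
    """
    indices = list(range(n_items))
    base, extra = divmod(n_items, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

# 41 complete collections, 5 folds -> test sizes 9, 8, 8, 8, 8 (about 20% each)
folds = kfold_indices(41, 5)
```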

miguelriemoliveira commented 2 months ago

We only need to divide the dataset into two: train and test/validation.

Let's call it train/test which is what we called it before.

brunofavs commented 2 months ago

Hello. I think you are misusing the cross validation @brunofavs.

The train_size is only used for Stratified Shuffle Split. For k-fold, the number of folds dictates the validation size: 5 folds divides the dataset into 5, giving a 20% size for validation. I suggest dividing the dataset into 5 folds.

I agree with @miguelriemoliveira, we do not have a division of train-validation-test. We only need to divide the dataset into two: train and test/validation.

Hi @manuelgitgomes

Ok, it makes sense, I was confusing the methods. I will divide it into 5 folds.

brunofavs commented 2 months ago

Hey @miguelriemoliveira

You should think of some better reason. For now it's not important, but in the thesis you will need a much better explanation. Might as well come up with it now, before running the tests.

Yes, you are right, I'm investigating a good source for those numbers.

I would expect you use the dataset_corrected.json as train dataset (input) for calibration, and receive as output a atom_calibration.json.

Yes, for the calibration I've always used the dataset.json (the other isn't even there before the first calibration). But regarding the evaluation, I'm not sure which one to use. I'm almost certain I should use the atom_calibration.json, like it is here : https://github.com/lardemua/atom/blob/92b9a432cf616516159a8cae7aac382a4d8ae874/atom_batch_execution/experiments/rrbot_example/template.yml.j2#L36

I just wanted to confirm

brunofavs commented 2 months ago

Hey.

I finished the initial draft for the template. The plan is to run evaluation between all possible pairs of sensors aka rgb to rgb, lidar to left and lidar to right camera :

#
#           █████╗ ████████╗ ██████╗ ███╗   ███╗
#          ██╔══██╗╚══██╔══╝██╔═══██╗████╗ ████║
#          ███████║   ██║   ██║   ██║██╔████╔██║
#          ██╔══██║   ██║   ██║   ██║██║╚██╔╝██║
#   __     ██║  ██║   ██║   ╚██████╔╝██║ ╚═╝ ██║    _
#  / _|    ╚═╝  ╚═╝   ╚═╝    ╚═════╝ ╚═╝     ╚═╝   | |
#  | |_ _ __ __ _ _ __ ___   _____      _____  _ __| | __
#  |  _| '__/ _` | '_ ` _ \ / _ \ \ /\ / / _ \| '__| |/ /
#  | | | | | (_| | | | | | |  __/\ v  v / (_) | |  |   <
#  |_| |_|  \__,_|_| |_| |_|\___| \_/\_/ \___/|_|  |_|\_\
#  https://github.com/lardemua/atom

# this yml file contains a set of commands to be run in batch.
# use jinja2 syntax for referencing variables

# Preprocessing will run only once before all experiments.
preprocessing:
  cmd: |
    ls /tmp

# Define batches to run
experiments:
{%- for e in experiments %}
  {% for run in runs %}
    {% set run_index = loop.index %}
    {% for fold in folds %}
      {{ e.name }}_run{{ '%03d' % run_index }}_fold{{ '%03d' % loop.index }}:
      cmd: |
        rosrun atom_calibration calibrate -json {{ dataset_path }}/dataset.json \
        -v -ss {{ run }} \
        -nig {{ e.nig_value }} {{ e.nig_value }} \
        -ntfl "world:base_footprint" \
        -ntfv {{ e.ntfv_value }} {{ e.ntfv_value }} \
        -csf 'lambda x: int(x) in {{ fold[0] }}' \
        &&

        # front_left_camera to front_right_camera evaluation
        rosrun atom_evaluation rgb_to_rgb_evaluation \
        -train_json {{ dataset_path }}/atom_calibration.json \
        -test_json $ATOM_DATASETS/softbot/train/dataset.json \
        -ss front_left_camera -st front_right_camera \
        -sfr -sfrn /tmp/rgb_rgb_evaluation.csv

        # front_left_camera to lidar3d evaluation
        rosrun atom_evaluation lidar_to_rgb_evaluation -rs lidar3d -cs front_left_camera \
         -train_json {{ dataset_path }}/atom_calibration.json \
         -test_json  {{ dataset_path }}/atom_calibration.json \
         -csf 'lambda x: int(x) in {{ fold[1] }}' \
         -sfr -sfrn /tmp/lidar3d_rgb_front_left_evaluation.csv

        # front_right_camera to lidar3d evaluation
        rosrun atom_evaluation lidar_to_rgb_evaluation -rs lidar3d -cs front_right_camera \
         -train_json {{ dataset_path }}/atom_calibration.json \
         -test_json  {{ dataset_path }}/atom_calibration.json \
         -csf 'lambda x: int(x) in {{ fold[1] }}' \
         -sfr -sfrn /tmp/lidar3d_rgb_front_right_evaluation.csv

      files_to_collect:    
        - '{{ dataset_path }}/atom_calibration.json'
        - '{{ dataset_path }}/atom_calibration_params.yml'
        - '{{ dataset_path }}/command_line_args.yml'
        - '/tmp/rgb_rgb_evaluation.csv'
        - '/tmp/lidar3d_rgb_front_left_evaluation.csv'
        - '/tmp/lidar3d_rgb_front_right_evaluation.csv'
    {%- endfor %}
  {%- endfor %}
{%- endfor %}
# End the loop

I also had to fix a bug in here https://github.com/lardemua/atom/blob/e712d5d8604b3392cab72a602bc72ca1148c28be/atom_evaluation/scripts/lidar_to_rgb_evaluation#L40

It was crashing before but isn't anymore. I'm yet to test whether it's fully functional.

I'm wondering if I should also include comparisons with some other methods here, as I noticed there were already some scripts there, such as the kalibr and opencv ones. Are these functional atm @miguelriemoliveira ?

brunofavs commented 2 months ago

I also didn't remove the runs loop because for instance, the rng seed relies on it, so I wasn't sure whether to keep it or not. I haven't tested the batch execution script though.

manuelgitgomes commented 2 months ago

Yes, for the calibration I've always used the dataset.json (the other isn't even there before the first calibration).

Please use dataset_corrected.json, as it is the output of dataset playback and has all the corrected labels. dataset.json has none of the work done in dataset playback.

But regarding the evaluation I'm not sure which one to use. I'm almost certain I should use the atom_calibration.json like it is here :

Use the atom_calibration.json as the train dataset, and the dataset_corrected.json as the test dataset. The train dataset has the calibrated tfs. These tfs will be moved to the test dataset, and then the evaluation will be carried out on that dataset. You can get more info about this in:

https://github.com/lardemua/atom/blob/efb640f9bd9a2d19950f26d3e8bc70930517263e/atom_core/src/atom_core/dataset_io.py#L899-L957

Please correct this in your template.
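The mechanism described above can be pictured with a toy sketch (this is not the actual dataset_io.py code, and the dictionary layout below is a simplified assumption): the calibrated transforms from the train dataset overwrite the corresponding ones in every collection of the test dataset, and the evaluation then runs on the test collections.

```python
import copy

def move_calibrated_tfs(train_dataset, test_dataset, tf_keys):
    """Toy illustration: copy selected calibrated transforms from a train
    dataset into every collection of a test dataset before evaluating.

    Assumes a simplified layout where each collection keeps its transforms
    in a 'transforms' dict keyed by 'parent-child' strings.
    """
    # calibrated transforms are static, so any train collection will do
    src = next(iter(train_dataset["collections"].values()))["transforms"]
    out = copy.deepcopy(test_dataset)  # leave the input dataset untouched
    for collection in out["collections"].values():
        for tf_key in tf_keys:
            collection["transforms"][tf_key] = copy.deepcopy(src[tf_key])
    return out
```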

I also didn't remove the runs loop because for instance, the rng seed relies on it, so I wasn't sure whether to keep it or not.

As you are using the -nig flag, I recommend you keep it.

brunofavs commented 1 month ago

Some updates :

Note: the batch in this issue was run with only a few runs and folds, 3/4 experiments, and max_nfev 1.

I got the batch execution running. There were a couple of errors with yaml indentation that caught me off guard.

After executing, I'm producing a folder tree like this :

.
├── nig_0.1-ntfv_0.1_run001_fold001
├── nig_0.1-ntfv_0.1_run001_fold002
├── nig_0.1-ntfv_0.1_run001_fold003
├── nig_0.1-ntfv_0.1_run001_fold004
├── nig_0.1-ntfv_0.1_run001_fold005
├── nig_0.1-ntfv_0.1_run002_fold001
├── nig_0.1-ntfv_0.1_run002_fold002
├── nig_0.1-ntfv_0.1_run002_fold003
├── nig_0.1-ntfv_0.1_run002_fold004
├── nig_0.1-ntfv_0.1_run002_fold005
├── nig_0.1_run001_fold001
├── nig_0.1_run001_fold002
├── nig_0.1_run001_fold003
├── nig_0.1_run001_fold004
├── nig_0.1_run001_fold005
├── nig_0.1_run002_fold001
├── nig_0.1_run002_fold002
├── nig_0.1_run002_fold003
├── nig_0.1_run002_fold004
├── nig_0.1_run002_fold005
├── perfect_sim_run001_fold001
├── perfect_sim_run001_fold002
├── perfect_sim_run001_fold003
├── perfect_sim_run001_fold004
├── perfect_sim_run001_fold005
├── perfect_sim_run002_fold001
├── perfect_sim_run002_fold002
├── perfect_sim_run002_fold003
├── perfect_sim_run002_fold004
└── perfect_sim_run002_fold005

31 directories

Where each folder has the following :

.
├── atom_calibration.json
├── atom_calibration_params.yml
├── command_line_args.yml
├── lidar3d_rgb_front_left_evaluation.csv
├── lidar3d_rgb_front_right_evaluation.csv
├── rgb_rgb_evaluation.csv
├── stderr.txt
└── stdout.txt

1 directory, 8 files

I'm also considering collecting the output of the calibrate script by using the -sce flag. It might not be as useful, but I can look directly at it and figure out whether the calibration is at least acceptable. I'm also more used to looking at that table, which eases possible debugging.

I also ran the processing script and it ran without issues and produced this structure :

.
├── nig_0.1
│   ├── lidar3d_rgb_front_left_evaluation.csv
│   ├── lidar3d_rgb_front_right_evaluation.csv
│   └── rgb_rgb_evaluation.csv
├── nig_0.1-ntfv_0.1
│   ├── lidar3d_rgb_front_left_evaluation.csv
│   ├── lidar3d_rgb_front_right_evaluation.csv
│   └── rgb_rgb_evaluation.csv
└── perfect_sim
    ├── lidar3d_rgb_front_left_evaluation.csv
    ├── lidar3d_rgb_front_right_evaluation.csv
    └── rgb_rgb_evaluation.csv

4 directories, 9 files

Until here everything seems to be going fine.

Are last week's bugs in the process results script still present?

I'm not quite sure how to interpret these processed outputs. Should the last row be the average of all the averages of the different files, or the average of the columns in this specific file? I'm aware these two are slightly different due to having folds with different sizes.

,RMS (pix),X err (pix),Y err (pix)
0,100.73429,80.92446,48.66281
1,102.894,82.09286999999999,47.548140000000004
2,84.00488,65.57289,40.209469999999996
3,92.5801,75.10312,36.765339999999995
4,84.89291,67.56887,38.41995
5,78.59551,59.8197,38.78294
6,100.19724,80.84762,41.73102
7,84.35844999999999,66.22267000000001,42.9447
8,97.68077,80.16893,43.83863
9,106.658,88.944,46.5895

Regardless of the processing, the raw data all seems to be OK. I'm now going to do a run without max_nfev and let it run for a while with only a few experiments to see if the results are good.

miguelriemoliveira commented 1 month ago

I'm not quite sure how to interpret these processed outputs. Should the last row be the average of all the averages of the different files, or the average of the columns in this specific file? I'm aware these two are slightly different due to having folds with different sizes.

My point is we can't have folds with different sizes. @manuelgitgomes does process results now deal with this?

Looks good.

manuelgitgomes commented 1 month ago

I'm not quite sure how to interpret these processed outputs. Should the last row be the average of all the averages of the different files, or the average of the columns in this specific file? I'm aware these two are slightly different due to having folds with different sizes.

First of all, I had yet to merge this into main, so, my bad: you were still using the old process_results. It is now merged.

My point is we can't have folds with different sizes. @manuelgitgomes does process results now deals with this?

It does the average of all the averages, which I believe is correct.
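For concreteness, the two definitions only coincide when all folds have the same size; a tiny sketch with made-up numbers:

```python
def average_of_averages(per_fold_errors):
    """Mean of the per-fold means: every fold weighs the same."""
    fold_means = [sum(errs) / len(errs) for errs in per_fold_errors]
    return sum(fold_means) / len(fold_means)

def pooled_average(per_fold_errors):
    """Mean over all samples pooled together: larger folds weigh more."""
    all_errs = [e for errs in per_fold_errors for e in errs]
    return sum(all_errs) / len(all_errs)

folds = [[1.0, 1.0, 1.0, 1.0], [3.0, 3.0]]  # unequal fold sizes
print(average_of_averages(folds))  # 2.0
print(pooled_average(folds))       # ~1.667
```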

miguelriemoliveira commented 1 month ago

Right. My recommendation is to do, like I did with the previous version, a test where you compute the average manually just to make sure it's correctly computed.

brunofavs commented 1 month ago

Right. My recommendation is to do, like I did with the previous version, a test where you compute the average manually just to make sure it's correctly computed.

Ok that works.

The preparation is done: the script runs without any problems.

I will create a new issue to discuss the experiments themselves, as this one is already getting too long and drifting from the original topic.