google-deepmind / tapnet

Tracking Any Point (TAP)
https://deepmind-tapir.github.io/blogpost.html
Apache License 2.0

Project Aria Sequence Names Have Changed #108

Closed relh closed 4 months ago

relh commented 4 months ago

To download the Project Aria digital twin data now you need to get a CDN JSON from a different address than the TapVid 3D README links to (see https://github.com/facebookresearch/projectaria_tools/issues/112). I would PR an update to the README to point to this new page but there's still a bug.

The new sequences from https://explorer.projectaria.com/ have changed the sequence names from the annotations for TapVid 3D.

If you run this command: python3 -m tapnet.tapvid3d.annotation_generation.generate_adt --adt_base_path /mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/

You see an error like this: [AriaDigitalTwinDataPathsProvider][ERROR]: sequence path does not exist: /mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131

This is because the new Project Aria sequences have names like Apartment_release_clean_seq131_M1292, with an additional _M1292 suffix tacked on. So the video folders no longer line up with the annotation numpy names:

ls ../adt/tmp/Apartment_release_clean_seq131_*
../adt/tmp/Apartment_release_clean_seq131_0.npz  ../adt/tmp/Apartment_release_clean_seq131_2.npz  ../adt/tmp/Apartment_release_clean_seq131_4.npz  ../adt/tmp/Apartment_release_clean_seq131_6.npz
../adt/tmp/Apartment_release_clean_seq131_1.npz  ../adt/tmp/Apartment_release_clean_seq131_3.npz  ../adt/tmp/Apartment_release_clean_seq131_5.npz  ../adt/tmp/Apartment_release_clean_seq131_7.npz

This all fails in the adt_utils file, where it's loading sequence names that don't exist. If I fix this I'll PR it; it might be as simple as pre-processing the Aria data back to the naming convention that the annotations use. https://github.com/google-deepmind/tapnet/blob/9ec6fa6df84094073d92ce158470519ce10240b3/tapnet/tapvid3d/annotation_generation/adt_utils.py#L135
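A minimal sketch of that pre-processing idea, assuming the new suffix is always `_M` followed by digits (as in `_M1292`). The helper names `legacy_name` and `link_legacy_names` are hypothetical and not part of tapnet or projectaria_tools. It creates symlinks under the old names rather than renaming the downloaded folders:

```python
import re
from pathlib import Path

# Assumption: the suffix added to the new sequence names is "_M" + digits.
_SUFFIX = re.compile(r"_M\d+$")


def legacy_name(new_name: str) -> str:
    """Strip the trailing _M<digits> suffix, if present."""
    return _SUFFIX.sub("", new_name)


def link_legacy_names(adt_base_path: str) -> list[str]:
    """Create <old_name> -> <new_name> symlinks inside adt_base_path."""
    base = Path(adt_base_path)
    created = []
    # sorted() materializes the listing before any symlink is created.
    for seq_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        old = legacy_name(seq_dir.name)
        target = base / old
        if old != seq_dir.name and not target.exists():
            # Relative link, resolved against the link's own directory.
            target.symlink_to(seq_dir.name, target_is_directory=True)
            created.append(old)
    return created
```

Calling `link_legacy_names` on the raw ADT folder would leave the downloaded data untouched, so the links can simply be deleted if the loader is later updated to accept the new names.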

ignacio-rocco commented 4 months ago

Hi, it seems both the data and the API have changed. For now, could you use the old ADT json file together with pip install projectaria-tools'[all]'==1.5.1a1 and download with, e.g., adt_benchmark_dataset_downloader --cdn_file ariajson.json --output_folder ./adt_raw --sequence_names Apartment_release_clean_seq131 -d 0 1 2 3 4 5 6 7 8?

Thanks for reporting this.

ignacio-rocco commented 4 months ago

After a bit more exploration, it seems that the new files do work as well as the old ones, modulo the small issue with the paths. Could you confirm whether the new files are also working for you after the change from https://github.com/relh/tapnet/commit/b24736ad56b5e0fbd94e69151dd8f4244240f930 ?

relh commented 4 months ago

Yeah! I just had to finish downloading the depth maps and rest of the annotations first to fully test. I'm running this command:

python3 -m tapnet.tapvid3d.annotation_generation.generate_adt --adt_base_path /mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/

I see a few errors/warnings:

[RecordReaderInterface][ERROR]: Tag 'metadata' was not found in the VRS file tags
Failed to parse eye gaze vergence file: Extra column "yaw_rads_cpf" in header of file "/mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/eyegaze.csv".

I've run into this error in an assert comparing the hashes. I'm debugging it now in case it's just because the string/name changed:

  File "/mnt/sda/tapvid/tapnet/tapnet/tapvid3d/annotation_generation/generate_adt.py", line 116, in main
    generate_adt_npz(_ADT_BASE_PATH.value, tmp_adt_dir, _OUTPUT_DIR.value)
  File "/mnt/sda/tapvid/tapnet/tapnet/tapvid3d/annotation_generation/generate_adt.py", line 95, in generate_adt_npz
    adt_utils.process_vid(
  File "/mnt/sda/tapvid/tapnet/tapnet/tapvid3d/annotation_generation/adt_utils.py", line 439, in process_vid
    assert trajectories_hash == in_npz["tracks_XYZ_hash"]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /mnt/sda/tapvid/tapnet/tapnet/tapvid3d/annotation_generation/adt_utils.py(439)process_vid()
    438     ).hexdigest()
--> 439     assert trajectories_hash == in_npz["tracks_XYZ_hash"]
    440 

ipdb> trajectories_hash
'93d1ca618dd8a2c91ae94f8e6852569c'
ipdb> in_npz["tracks_XYZ_hash"]
array('a2f0766185dc875a7c661dce1713594a', dtype='<U32')

relh commented 4 months ago

visibilities.mean()
0.6467313131313132

in_npz['visibility_mean']
array(0.64673131)

The hashes aren't the same but the visibilities are, so it might be the same data. Currently still checking.

relh commented 4 months ago

[MpsDataPathsProvider][WARNING]: Hand tracking folder (/mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/mps/hand_tracking) does not exist in MPS root folder, not loading wrist and palm poses.
[MultiRecordFileReader][DEBUG]: Opened file '/mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/video.vrs' and assigned to reader #0
[VrsDataProvider][INFO]: streamId 211-1/camera-et activated
[VrsDataProvider][INFO]: streamId 214-1/camera-rgb activated
[VrsDataProvider][INFO]: streamId 247-1/baro0 activated
[VrsDataProvider][WARNING]: Unsupported TimeSync mode: APP, ignoring.
[VrsDataProvider][INFO]: Timecode stream found: 285-2
[VrsDataProvider][INFO]: streamId 1201-1/camera-slam-left activated
[VrsDataProvider][INFO]: streamId 1201-2/camera-slam-right activated
[VrsDataProvider][INFO]: streamId 1202-1/imu-right activated
[VrsDataProvider][INFO]: streamId 1202-2/imu-left activated
[VrsDataProvider][INFO]: streamId 1203-1/mag0 activated
[AriaDigitalTwinDataProvider][INFO]: loading instance info from json file /mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/instances.json                                                                                                                                                                   
Loaded #closed loop trajectory poses records: 2880                                                                                                                                                                                                                                                                                              
[MultiRecordFileReader][DEBUG]: Opened file '/mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/segmentations.vrs' and assigned to reader #0
[StreamIdLabelMapper][WARNING]: stream id 400-1 not found in Aria Device Model. You will not be able to get the label of this stream. 
[VrsDataProvider][INFO]: streamId 400-1/NA activated
[StreamIdLabelMapper][WARNING]: stream id 400-2 not found in Aria Device Model. You will not be able to get the label of this stream. 
[VrsDataProvider][INFO]: streamId 400-2/NA activated
[StreamIdLabelMapper][WARNING]: stream id 400-3 not found in Aria Device Model. You will not be able to get the label of this stream. 
[VrsDataProvider][INFO]: streamId 400-3/NA activated
[VrsDataProvider][WARNING]: VRS file does not contain calib_json field in VRS tags.                                                                                     
[RecordReaderInterface][ERROR]: Tag 'metadata' was not found in the VRS file tags
[MultiRecordFileReader][DEBUG]: Opened file '/mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/depth_images.vrs' and assigned to reader #0
[StreamIdLabelMapper][WARNING]: stream id 345-1 not found in Aria Device Model. You will not be able to get the label of this stream. 
[VrsDataProvider][INFO]: streamId 345-1/NA activated                                                                                                                    
[StreamIdLabelMapper][WARNING]: stream id 345-2 not found in Aria Device Model. You will not be able to get the label of this stream. 
[VrsDataProvider][INFO]: streamId 345-2/NA activated
[StreamIdLabelMapper][WARNING]: stream id 345-3 not found in Aria Device Model. You will not be able to get the label of this stream. 
[VrsDataProvider][INFO]: streamId 345-3/NA activated                      
[VrsDataProvider][WARNING]: VRS file does not contain calib_json field in VRS tags.                                                                                     
[RecordReaderInterface][ERROR]: Tag 'metadata' was not found in the VRS file tags
[RecordFileReader][DEBUG]: Reading TagsRecord for RGB Camera Class #1                                                                                                   
[TagsRecord][DEBUG]: Read 7 VRS tags and 5 user tags for RGB Camera Class #1
[RecordFileReader][DEBUG]: Reading TagsRecord for Camera Data (SLAM) #1
[TagsRecord][DEBUG]: Read 7 VRS tags and 5 user tags for Camera Data (SLAM) #1
[RecordFileReader][DEBUG]: Reading TagsRecord for Camera Data (SLAM) #2
[TagsRecord][DEBUG]: Read 7 VRS tags and 5 user tags for Camera Data (SLAM) #2
[RecordFileReader][DEBUG]: Reading TagsRecord for IMU Data (SLAM) #1                                                                                                    
[TagsRecord][DEBUG]: Read 7 VRS tags and 5 user tags for IMU Data (SLAM) #1
[RecordFileReader][DEBUG]: Deleted 4 TagsRecords from the index. 
[MultiRecordFileReader][DEBUG]: Opened file '/mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/synthetic_video.vrs' and assigned to reader #0
[VrsDataProvider][INFO]: streamId 214-1/camera-rgb activated
[VrsDataProvider][INFO]: streamId 1201-1/camera-slam-left activated
[VrsDataProvider][INFO]: streamId 1201-2/camera-slam-right activated
[VrsDataProvider][INFO]: streamId 1202-1/imu-right activated
[DeviceCadExtrinsics][WARNING]: No CAD available for simulated device
[RecordReaderInterface][ERROR]: Tag 'metadata' was not found in the VRS file tags
[AriaDigitalTwinDataProvider][INFO]: skip loading skeletonMetaDataFilePath because the data path is empty
[AriaDigitalTwinDataProvider][INFO]: skip loading skeletonsFilePaths because the data path is empty
Failed to parse eye gaze vergence file: Extra column "yaw_rads_cpf" in header of file "/mnt/sda/tapvid/tapnet/tapvid3d_dataset/raw_adt/Apartment_release_clean_seq131_M1292/eyegaze.csv".
Loaded #EyeGazes: 3526

It seems like there might be some changes in the data/streams? I'm still not sure. Going to poke around a little bit more. I tried changing up the endianness and hashing again but couldn't get them to agree.

ignacio-rocco commented 4 months ago

The hashing functionality was added a bit last minute and it's not fully tested. If you could disable this for the moment and record the examples where the hashing differs, and then share those files with me, I can debug further. Thanks!
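One way to do this without losing track of the mismatches is to replace the assert with a check that logs and continues. This is only a sketch: `check_tracks_hash` is a hypothetical helper, not something in adt_utils, and it assumes the stored hash is a 0-d string array like the one shown in the ipdb session above.

```python
import numpy as np


def check_tracks_hash(name, computed_hash, stored_hash, mismatches):
    """Compare hashes; record a mismatch instead of raising AssertionError."""
    # in_npz["tracks_XYZ_hash"] is a 0-d '<U32' array; .item() yields a str.
    stored = np.asarray(stored_hash).item()
    ok = computed_hash == stored
    if not ok:
        mismatches.append((name, computed_hash, stored))
    return ok


mismatches = []
# Values copied from the ipdb session earlier in this thread.
check_tracks_hash(
    "Apartment_release_clean_seq131_0",
    "93d1ca618dd8a2c91ae94f8e6852569c",
    np.array("a2f0766185dc875a7c661dce1713594a"),
    mismatches,
)
print(mismatches)
```

At the end of a run, `mismatches` lists the npz names to share for debugging.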

relh commented 4 months ago

a58ad5052af6f55d49c1cde0d824a91
a6fb9c60635f73a788598569cc69d2a4

61a48b1fa57164c00d10e58eafe9947a
61a48b1fa57164c00d10e58eafe9947a

1daac6d820182d48d9eb6d369be7909f
3cf6f1e47c95d8e0bd782d24e31d9c72

f13b4c0b5cf8aacbee441345d0d058da
f13b4c0b5cf8aacbee441345d0d058da

be5fd485a3c7409d07863bf6e88efce6
be5fd485a3c7409d07863bf6e88efce6

67fe77ec6fbb0376270e56cc59d0d0e8
67fe77ec6fbb0376270e56cc59d0d0e8

 78%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋                                               | 7/9 [04:05<01:11, 35.80s/it]

It seems to be a mix: about 50% or more of the hashes agree and the rest don't. I'm still generating the annotations, as I had to add a try/continue because 2-3 of the 200+ ADT depth maps/segmentations couldn't be downloaded from the Project Aria explorer page.

Once it's done I'll check whether the ones with non-matching hashes still look good visually, but there's probably something weird at play. Given it's about 50/50, I wonder if it's something like rounding.

I tried hashing a bunch of other transforms just in case, but the original one in adt_utils is the only one for which at least 50% of the hashes match:

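    import itertools

    import numpy as np

    scales = [1, 10, 100, 1000, 10000]
    signs = [-1, 1]
    data_types = [np.int32, np.int64]
    use_absolute = [True, False]
    endiannesses = ['<', '>']
    # Materialize the permutations: a bare itertools.permutations iterator
    # would be exhausted after the first pass through the innermost loop.
    axes_permutations = list(itertools.permutations([0, 1, 2]))

    for scale in scales:
        for data_type in data_types:
            for endian in endiannesses:
                for abs_val in use_absolute:
                    for signs_combination in itertools.product(signs, repeat=3):
                        for permutation in axes_permutations:
                            ...  # loop body was not included in the post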

ignacio-rocco commented 4 months ago

Hi, I was able to redownload the whole ADT raw data with aria_dataset_downloader --cdn_file Aria_Digital_Twin_1720774989.json -o ./adt_raw -d 0 6 7 8 without issues.

I'm regenerating all the npz files from this freshly downloaded ADT dataset, and so far all hashes match, so I couldn't reproduce your issue. Could you please share some of the npz files you generated that have non-matching hashes so I can take a closer look?

Thanks, Ignacio

ignacio-rocco commented 4 months ago

Hi @relh, I've updated the codebase and base npz files to handle the ADTv2 version. This should be fixed now. Sorry for this issue and thanks for your patience.

relh commented 4 months ago

Thanks! I think the hashing issue was unrelated. It is probably a versioning issue on my side.

I've recreated my python env from scratch and followed the instructions here: https://github.com/google-deepmind/tapnet/tree/main/tapnet/tapvid3d

I've then run the generate_all.sh script.

All of my visibilities match, but only about 50% of my hashes match. I don't know what the problem is but if the trajectories look bad I'll update this or make a new issue as a sign post for other people. Thanks!

I can send you a .npz file if you want but luckily it seems to be a me problem and not a v2 of ADT integration problem :).

ignacio-rocco commented 4 months ago

Hey Richard, please do send me a few npz with non-matching hashes and I'll take a look. If you can post a link here to GDrive/Dropbox/other that'd be great.

Thanks.

relh commented 4 months ago

Sure thing! Here's one:

https://drive.google.com/file/d/1vD4NLJayJuwME67t5aTPjycocYBT28kJ/view?usp=sharing

-- hashes ---
6d67b236c8ae5e87cbc05291b08792cd
f424c88b10317760c5819f00d813f227
------

This was my print-out for it. One hash is the computed one; the other is from the npz.

Here is another one in case having 2 helps: https://drive.google.com/file/d/1LF-UDFxi3zadBrKRMQMFIJNFCL489-PA/view?usp=sharing

ignacio-rocco commented 4 months ago

Ok, that was helpful. It seems that numerical errors produce some ±1 differences when rounding the tracks to integer millimeters. I'll switch to using means for both tracks and visibilities shortly.
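To illustrate why a hash of rounded values is fragile while a mean is not, here is a small self-contained sketch. The use of MD5 over int32 millimeters is an assumption for illustration (the hashes in this thread are 32 hex characters, which is consistent with MD5), and `rounded_hash` is a hypothetical function, not the one in adt_utils.

```python
import hashlib

import numpy as np


def rounded_hash(tracks_xyz_m: np.ndarray) -> str:
    """Hash tracks after rounding from meters to integer millimeters."""
    mm = np.round(tracks_xyz_m * 1000.0).astype(np.int32)
    return hashlib.md5(mm.tobytes()).hexdigest()


# Two versions of the same point that differ by 0.2 nanometers in X,
# sitting on either side of a rounding boundary (1.5 mm).
reference = np.array([[0.0014999999, 0.2, 0.3]])
regenerated = np.array([[0.0015000001, 0.2, 0.3]])

hashes_equal = rounded_hash(reference) == rounded_hash(regenerated)
means_close = bool(np.isclose(reference.mean(), regenerated.mean(), atol=1e-5))
print(hashes_equal, means_close)
```

A single coordinate flipping from 1 mm to 2 mm changes the whole digest, whereas the mean moves by a negligible amount, so a tolerance-based mean comparison is robust to this kind of platform-dependent floating-point noise.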

I've already uploaded new rc3 npz files with this field, and the code update will follow shortly, in case you want to double check the files you generated.

As an extra note, I also realized that the sequence "Apartment_release_work_skeleton_seq136" had some corrupt depth/segmentation files in ADT-v1, which resulted in new npz files for that sequence only. You might need to regenerate these.

relh commented 4 months ago

Wow good bug hunting! Thanks for investigating.

I also found maybe 1 sequence that broke the visibilities assert, it might be enough to change the threshold from 1e-5 to 1e-4, but I didn't check.
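As a rough sketch of what loosening that threshold would mean, assuming the check is an absolute difference between the recomputed and stored visibility means (`visibility_mean_matches` is a hypothetical name, and I haven't verified how adt_utils phrases the assert):

```python
import numpy as np


def visibility_mean_matches(visibilities, stored_mean, atol=1e-5):
    """True if the recomputed visibility mean is within atol of the stored one."""
    diff = abs(float(np.mean(visibilities)) - float(stored_mean))
    return diff <= atol


# Values from earlier in this thread: they differ by about 3e-9.
ok_strict = visibility_mean_matches(np.array([0.6467313131313132]), np.array(0.64673131))

# A made-up drift of about 5e-5 fails at 1e-5 but passes at 1e-4.
drift_strict = visibility_mean_matches(np.array([0.64678]), 0.64673131)
drift_loose = visibility_mean_matches(np.array([0.64678]), 0.64673131, atol=1e-4)
print(ok_strict, drift_strict, drift_loose)
```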