Updated images from qc.py show many duplicate images with different image IDs/metadata. We need to filter out duplicates so that they don't dominate the ICA components, e.g. in collections 410 and 1886 (and probably many more).
The easiest way may be to use the automatically generated NV metadata, such as brain_coverage and perc_bad_voxels (and perc_voxels_outside, although this is missing in some images). But this might not work: when I checked the number of unique combinations of these three fields in the NV metadata, there were only about 2000, while there are ~9000 unique images.
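A rough sketch of the metadata check described above, in case it helps. It assumes the metadata is already loaded as a list of dicts, one per image, with a hypothetical `id` key next to the three NV fields; a missing field is treated as `None`, so images lacking perc_voxels_outside can still be grouped.

```python
from collections import defaultdict

# The three automatically generated fields mentioned above.
KEYS = ("brain_coverage", "perc_bad_voxels", "perc_voxels_outside")


def group_by_metadata(records, keys=KEYS):
    """Map each metadata signature to the list of image ids that share it.

    `records` is assumed to be a list of dicts with an "id" key (hypothetical
    layout); a field that is absent contributes None to the signature.
    """
    groups = defaultdict(list)
    for rec in records:
        signature = tuple(rec.get(k) for k in keys)
        groups[signature].append(rec["id"])
    return dict(groups)


def candidate_duplicates(records, keys=KEYS):
    """Return groups of ids whose signatures collide (possible duplicates)."""
    return [ids for ids in group_by_metadata(records, keys).values() if len(ids) > 1]
```

`len(group_by_metadata(records))` gives the number of unique combinations. A collision only marks a candidate: two different images could share all three values, so each group would still need to be confirmed against the image data.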
:+1: Thanks for doing a bit of work up-front. I will try out the metadata approach (pre-download), and fall back to the full image comparison (post-download).
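For the post-download fallback, one possible approach (an assumption on my side, not something already in qc.py) is to hash the raw voxel data of each image and group identical digests. The sketch takes a dict of image id to `bytes`; how the bytes are obtained from the downloaded files is left open.

```python
import hashlib
from collections import defaultdict


def data_digest(data):
    """Return a SHA-256 hex digest of the raw image data (a bytes object)."""
    return hashlib.sha256(data).hexdigest()


def exact_duplicates(images):
    """Group image ids whose data is byte-for-byte identical.

    `images` is assumed to map image id -> bytes of the voxel data
    (hypothetical layout). Only groups with more than one id are returned.
    """
    by_digest = defaultdict(list)
    for image_id, data in images.items():
        by_digest[data_digest(data)].append(image_id)
    return [sorted(ids) for ids in by_digest.values() if len(ids) > 1]
```

This only catches exact copies. Images that were resampled, rescaled, or saved with a different data type would get different digests and would need a tolerance-based comparison instead.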
@bcipolli, can you tackle this problem?