pleonard212 / pix-plot

A WebGL viewer for UMAP or TSNE-clustered images
MIT License

Associating filenames and clusters #275

Open Giuliagiorgi opened 1 year ago

Giuliagiorgi commented 1 year ago

Hi everyone and thank you for this amazing tool!

I'm trying to cluster around 9k images with PixPlot so that I can exclude clusters of non-pertinent images. To do that, I'd like to know whether it's possible to get a file listing the filenames of the images in each cluster, without using the lasso option (which would be less precise).

In the 'hotspot' JSON file, the images in each cluster are listed as integers, followed by the filename of the centroid. Something along the lines of:

{
    "images": [
        20,
        23,
        118,
        177,
        221,
        222,
        223,
        224,
        ...
    ],
    "img": "B_E33VKIOek.jpg",
    "label": "Cluster 2"
}

This makes it difficult to filter the images in the original folder, as I cannot link them to the ones in the cluster. The output I'm looking for is instead something like:

{
    "images": [
        "B_E33VKIOek.jpg",
        "B_aaaaaaa.jpg",
        "B_bbbbbb.jpg",
        "B_fofofo.jpg",
        ...
    ],
    "img": "B_E33VKIOek.jpg",
    "label": "Cluster 2"
}

Am I missing something, and does this output already exist? If not, how can I link the integers in the 'hotspot' JSON file to the filenames of the images?

Thank you very much, Giulia

duhaime commented 1 year ago

Hey Giulia! Those integers in the images key are just the index positions of the images within the imagelist in the output. If you write a little script, it should be easy to convert the index positions to filenames. I hope that helps!
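
For anyone who finds this thread later, here is a minimal sketch of such a script in Python. It rests on a few assumptions that are not confirmed in this thread: that the hotspot file is a JSON list of objects like the one shown above, and that the imagelist file is either a plain JSON list of filenames or an object holding that list under an "images" key. The function names and the paths in the usage note are hypothetical, so adjust them to whatever is in your own output folder.

```python
import json
from pathlib import Path


def load_json(path):
    """Read and parse a JSON file."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def extract_filenames(imagelist):
    """Return the list of filenames from the parsed imagelist.

    Assumption: the imagelist is either a plain list of filenames or a
    dict that stores that list under the "images" key.
    """
    if isinstance(imagelist, dict):
        return imagelist["images"]
    return imagelist


def resolve_hotspot_filenames(hotspots, filenames):
    """Return copies of the hotspots with image indices replaced by filenames."""
    resolved = []
    for hotspot in hotspots:
        entry = dict(hotspot)  # shallow copy, the input stays untouched
        entry["images"] = [filenames[index] for index in hotspot["images"]]
        resolved.append(entry)
    return resolved


def convert_files(hotspot_path, imagelist_path, output_path):
    """Load both files, resolve the indices, and write the result as JSON."""
    hotspots = load_json(hotspot_path)
    filenames = extract_filenames(load_json(imagelist_path))
    resolved = resolve_hotspot_filenames(hotspots, filenames)
    with Path(output_path).open("w", encoding="utf-8") as handle:
        json.dump(resolved, handle, indent=4)
    return resolved
```

To use it, call convert_files with the path to your hotspot JSON, the path to your imagelist JSON, and the name of the file to write, for example convert_files("hotspot.json", "imagelist.json", "clusters_with_filenames.json"). Each cluster in the resulting file keeps its "img" and "label" values, and its "images" list holds filenames instead of integers.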