Jordan-Pierce / CoralNet-Toolbox

Tools useful for interacting with CoralNet and other CPCe-related downstream tasks

https://coralnet.ucsd.edu/

Other


Feature Request: Ability to Delete and Regenerate Patches CSV in CoralNet Toolbox #8

Closed m-h-williams closed 1 month ago

m-h-williams commented 2 months ago

Hello Jordan,

I have a suggestion for a useful addition to the CoralNet Toolbox. It would be great to have a feature that allows users to delete patches created using the Patch Extractor Tool and subsequently generate a new patches CSV file. This new CSV file could then be used to train a classifier.

The ability to extract patches from an annotation file is incredibly helpful for visualizing the patches associated with specific labels. However, during this process with my source, I noticed some erroneous patches that I wanted to remove to clean up the annotations. While I can manually delete these patches from the folder on my computer, this doesn't update the annotation file or create a new patches file. I find that this hinders the ability to train a classifier with the cleaned data.

Adding this feature could streamline the process and enhance the utility of the toolbox for users like myself who need to refine their data before training classifiers.

Perhaps there is a work around with the current tools that I am not thinking of. What do you think?

Megan
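One possible workaround for the request above can be sketched in Python: after deleting unwanted patch images from the folder, rewrite the patches CSV so that it keeps only the rows whose image file still exists. This is only a sketch and not part of the toolbox; the function name `regenerate_patches_csv` and the column name `Path` are assumptions, so check the header of the actual patches file before using it.

```python
import csv
from pathlib import Path


def regenerate_patches_csv(patches_csv, output_csv, path_column="Path"):
    """Write a new CSV holding only rows whose patch image still exists.

    `path_column` is an assumed column name for the patch file path.
    Returns a (kept, dropped) tuple of row counts.
    """
    with open(patches_csv, newline="") as src:
        reader = csv.DictReader(src)
        rows = list(reader)
        fieldnames = reader.fieldnames or []

    kept, dropped = 0, 0
    with open(output_csv, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            # A patch deleted by hand no longer exists on disk, so skip its row.
            if Path(row[path_column]).is_file():
                writer.writerow(row)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

If the CSV stores relative paths, the script would need to run from the directory those paths are relative to.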

Jordan-Pierce commented 2 months ago

Hi @m-h-williams,

Thanks for the suggestion; I agree that this would be a great addition. The Patch Extraction tool is an EXE, and I don't have access to the underlying code; however, it wouldn't be too difficult to whip something up that replicates the utility. I also think it's about time to move towards a customizable patch annotation tool instead of solely relying on the former.

Before jumping into the code, in your mind, what features should be migrated / removed from the existing Patch Extraction tool, and also, what other features would you like to see in the new tool? Having a bigger picture of what is necessary / desired will help with scoping out the work. My thoughts are:

Set patch size in pixels
Set label name, color code (RGB); exportable labelset as a JSON so it can be re-imported back in the future
Create, move, delete patch annotations
Load image(s) panel (vertical), scrollable
Export annotations, import existing annotations (auto load on the respective image if already in the image panel)
Autosave patches.csv every N seconds as temp file, just in case a crash occurs
(maybe) have a table (dataframe) showing the annotations made, and have it filterable; when clicked, maybe auto go to the image / annotations (not sure how much this would be used)
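The exportable labelset idea in the list above could look roughly like this. It is a minimal sketch, assuming a label is just a name plus an RGB triple; the `Label` class and the JSON field names `name` and `color` are hypothetical and not the toolbox's actual format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Label:
    name: str
    color: tuple  # (R, G, B), each value 0-255


def export_labelset(labels, path):
    """Save a list of Label objects as a JSON array."""
    with open(path, "w") as f:
        json.dump([asdict(label) for label in labels], f, indent=2)


def import_labelset(path):
    """Load Label objects back from a file written by export_labelset."""
    with open(path) as f:
        # JSON has no tuple type, so colors come back as lists; convert them.
        return [Label(item["name"], tuple(item["color"])) for item in json.load(f)]
```

Exporting and re-importing should give back the same names and colors, which is what keeps a labelset consistent across sessions and projects.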

What else would you like to see?

CoralNet-Toolbox_Patch_Annotator drawio

m-h-williams commented 2 months ago

Hi @Jordan-Pierce ,

Thanks for considering my suggestion and for your detailed thoughts on the new tool! The features you've outlined sound great and would significantly enhance the usability of the Patch Extraction tool.

As for additional features, here are my thoughts:

Patch Size and Label Customization: The ability to set patch sizes and customize labels with names and color codes (RGB) sounds perfect. The option to export and import label sets as JSON files would also be very useful for consistency across projects.

* I have a couple of questions here: Would the resolution of the image be set by the user, or would that information be gathered from the image data? Also, are you thinking the color code would be selected from a color wheel, or automatically assigned when a new label is created? I think either option would be great for visualizing different annotations.

Patch Annotation Management: Including the functionality to create, move, and delete patch annotations is essential, as it gives users the flexibility to refine their datasets directly within the tool.

* One suggestion here: it would be helpful to have the ability to toggle annotations for specific labels on and off. This feature would allow users to clear the image of annotation boxes once they are done creating image patches for a particular label, making it easier to see the entire image when working on a new label. The excess of boxes overlaid on an image can sometimes overwhelm the view.

Image Panel and Annotations Management: Having a scrollable image panel and the ability to load multiple images is important for batch processing. The idea of auto-loading existing annotations and autosaving the patches.csv file as a temp file every few seconds is excellent for preventing data loss.

Annotations Table and Filtering: The idea of having a table showing the annotations made, with the ability to filter and click to navigate to specific images/annotations, could be very beneficial. As for the auto-navigation to the image/annotations upon clicking in the table, it could be particularly useful for quickly reviewing specific patches or labels, especially if there are many images to sift through.

Overall, these feature additions are very exciting and would significantly streamline the workflow and improve data management. If there's anything specific you'd like me to focus on or clarify further, please let me know!

Jordan-Pierce commented 2 months ago

I'm thinking:

Image resolution would be native, and it would be read from the image, user wouldn't need to set anything (the image also won't be down-sampled even if large); in the image viewer, you'd be able to zoom in.
The patch size will be something you set based on your needs and image resolution; whatever patch size is set to is what the patch will be when you make a new patch; you can have patches of different sizes in the same image / project if you change the patch size; just like the other tool, you'll see how big the patch is.
Color code will be set by default (some random color), but you can set it using a pop-up color wheel before pressing Okay.
We can have it so that each label has an eye button so you can toggle it on / off, and then a master toggle on / off to turn all on or off.
Annotation table for filtering will probably be the last addition, but good to hear that it could be useful! It'd be a new window (all the others will likely be docked and locked), and you'd be able to select a row, and then it would auto go to the image, and highlight the patch; it'd also be cool if you could select multiple rows and change the labels there (if needed), or even delete, change patch size, etc. This will require some thought, so that's why it'll be last.
Exporting the annotations (as a CSV) will export all annotations currently in the project, CoralNet-format; we could have it in the future where given the annotation table, you can select a subset and export just those.
Importing annotations (CSV files) can be done multiple times, so you can import a file, they load, and then import another csv file, and they load as well, etc.
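To make the patch-size point above concrete, here is a rough sketch of turning an annotation's center point and the current patch size into a crop box at native resolution. The function name `patch_box` is hypothetical, and shifting the box so it stays inside the image is an assumption; a real tool might pad or clip at the border instead.

```python
def patch_box(center_x, center_y, patch_size, image_width, image_height):
    """Return (left, top, right, bottom) for a square patch around a point.

    The box is shifted, not shrunk, when the point sits near an edge, so
    patches keep the requested size whenever the image is large enough.
    """
    half = patch_size // 2
    left = max(0, min(center_x - half, image_width - patch_size))
    top = max(0, min(center_y - half, image_height - patch_size))
    right = min(image_width, left + patch_size)
    bottom = min(image_height, top + patch_size)
    return left, top, right, bottom
```

Because the size is passed per call, patches of different sizes can coexist in the same image, as described above.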

(for posterity)

CoralNet-Toolbox_Patch_Annotator drawio (1)

m-h-williams commented 2 months ago

This sounds fantastic! I'm really excited about these features and the flexibility they will bring to the tool.

It's great to hear that the image resolution will be read from the native image, allowing for zooming without down-sampling. The flexibility to set patch sizes based on specific needs and resolutions, and the option to have patches of different sizes in the same project, will be incredibly useful.

The toggle feature for turning labels on and off, including a master toggle, is a thoughtful addition. It will definitely help in managing annotations more effectively without overwhelming the view. I'm also looking forward to the annotation table for filtering and editing.

The plans for exporting and importing annotations sound comprehensive and will be a big help for managing datasets. I appreciate the consideration for future updates, like exporting subsets of annotations based on the table view.

Thank you for taking these suggestions into account and for the thoughtful planning. I'm excited to see these features come to life! Please feel free to reach out if there's anything I can do to help or support the development process.

Jordan-Pierce commented 2 months ago

@m-h-williams it's up if you wanna give it a go!

just remember to update:

# cmd

conda activate your_env

git fetch
git pull

python install.py

python toolbox.py

Please log any issues you have in this thread!

m-h-williams commented 2 months ago

@Jordan-Pierce Thank you so much for the hard work you've put into the beta version of the new annotation tool—it's really coming together nicely!

I had a few thoughts and suggestions as I’ve been exploring it:

When clicking on a label in the lower label window, could it highlight the corresponding patch(es) with that label? I think this would make it easier to identify and manage specific patches.
I noticed we can delete a label, but I was wondering if there's a way to delete a single image patch for a label instead of all the patches for that label? I believe you mentioned this feature might be added later, and I think it would be really helpful. The upcoming addition of the ability to move patches will be a great enhancement as well!
It would also be helpful if the outline of a selected image patch were a brighter color. Right now, it matches the box color, making it a bit difficult to see.
The short code in the label window seems to be getting cut off after four characters, which makes it hard to read the full label.
It would be nice to have toggle on/off buttons for each label, along with a master toggle button. This way, we could start with all labels off and then turn them on one at a time to better see which labels need to be changed. Perhaps this was a feature you were thinking of adding later on as well.
When using the "Edit Label" function, it currently changes the label for all image patches with the previous label. It would be great if we could select an individual image patch, click "Edit Label," and change the label just for that specific patch. Additionally, after changing a label, hovering over the new label still shows the old label instead of the updated label (as shown in the screenshot below).
When using the "Add Label" button, it would be helpful if we could select where the image patch is placed. Currently, it seems to create the patch in the middle of the image, which might not be the ideal location.
Lastly, the add, edit, and delete label buttons above the label window are overlapping each other, as seen in the screenshot below.

Thanks again for all your efforts—I'm really excited to see these features evolve. Please let me know if there’s anything I can do to support your work!

Jordan-Pierce commented 2 months ago

@m-h-williams instructions! My bad: here they are, will add to repo next round:

CoralNet Toolbox Instructions

Overview

The CoralNet Toolbox is a Python application built using PyQt5 for image annotation and analysis. This guide provides instructions on how to use the application, including key functionalities and hotkeys.

Main Window

The main window consists of several components:

Menu Bar: Contains import, export, and other actions.
Tool Bar: Contains tools for selection and annotation.
Annotation Window: Displays the image and annotations.
Label Window: Lists and manages labels.
Thumbnail Window: Displays thumbnails of imported images.
Confidence Window: Displays cropped images and confidence charts.

Menu Bar Actions

Import:
- Import Images: Load image files.
- Import Labels (JSON): Load label data from a JSON file.
- Import Annotations (JSON): Load annotation data from a JSON file.
- Import Annotations (CoralNet): Load annotation data from a CoralNet CSV file.
Export:
- Export Labels (JSON): Save label data to a JSON file.
- Export Annotations (JSON): Save annotation data to a JSON file.
- Export Annotations (CoralNet): Save annotation data to a CoralNet CSV file.
CoralNet:
- Authenticate: Authenticate with CoralNet.
- Upload: Upload data to CoralNet.
- Download: Download data from CoralNet.
- Model API: Access CoralNet model API.
Machine Learning:
- Create Dataset: Create a dataset for machine learning.
- Train Model: Train a machine learning model.
- Make Predictions: Make predictions using a trained model.

Tool Bar

Select Tool: Select and move annotations.
Annotate Tool: Add new annotations.
Polygon Tool: Draw polygon annotations.

Annotation Window

Zoom: Use the mouse wheel to zoom in and out.
Pan: Hold the right mouse button to pan the image.
Add Annotation: Click with the left mouse button while using the annotate tool.
Select Annotation: Click on an annotation while using the select tool.

Hotkeys

Ctrl + Z: Undo the last action. - Don't use
Ctrl + Y: Redo the last action. - Don't use
Delete: Delete the selected annotation.
Ctrl + W/A/S/D: Navigate through labels.
Ctrl + Mouse Wheel: Adjust annotation size.

Label Window

Add Label: Click the "Add Label" button to add a new label.
Edit Label: Click the "Edit Label" button to edit the selected label.
Delete Label: Click the "Delete Label" button to delete the selected label.

Thumbnail Window

Load Image: Click on a thumbnail to load the image in the annotation window.
Delete Image: Right-click on a thumbnail and select "Delete" to remove the image.

Confidence Window

Display Cropped Image: Shows the cropped image of the selected annotation.
Confidence Chart: Displays a bar chart with confidence scores.

Additional Tips

Annotation Sampling: Use the "Sample Annotations" action in the menu bar to automatically generate annotations.
Transparency Control: Adjust the transparency slider in the status bar to change annotation transparency.

Jordan-Pierce commented 2 months ago

CoralNet-Toolbox QT

Jordan-Pierce commented 2 months ago

When clicking on a label in the lower label window, could it highlight the corresponding patch(es) with that label? I think this would make it easier to identify and manage specific patches.

Possibly; I would need to change how the annotations are highlighted in the annotation window for this. I might provide a separate button or action for selecting all of one label in the scene. But before that, I need to add the ability to select more than one annotation at a time.

I noticed we can delete a label, but I was wondering if there's a way to delete a single image patch for a label instead of all the patches for that label? I believe you mentioned this feature might be added later, and I think it would be really helpful. The upcoming addition of the ability to move patches will be a great enhancement as well!

Yes, choose the Select tool instead of the Annotate tool (see cursor icon); when in this tool, you can select individual patches and delete using the delete keyboard key. Deleting the label from the label window will remove all annotations in the entire project for that label. Additionally, if you want to map one label to another (merge), you can use the edit label button and enter the short and long label codes of the desired label; the annotations will then be changed to that label.
Can add ability to move, will do!

It would also be helpful if the outline of a selected image patch were a brighter color. Right now, it matches the box color, making it a bit difficult to see.

Will do

The short code in the label window seems to be getting cut off after four characters, which makes it hard to read the full label.

This is partially due to the resolution problem (see below); but I meant to add a strict number of characters for the short label code (like, 8?) so users have to use a short label; hovering will show the long label. Will fix.

It would be nice to have toggle on/off buttons for each label, along with a master toggle button. This way, we could start with all labels off and then turn them on one at a time to better see which labels need to be changed. Perhaps this was a feature you were thinking of adding later on as well.

Agreed; this is partially available now by selecting each label in the label window when using the selector tool, or hot keys (cntrl + wasd), and then changing the transparency; then do this for each label not in use. I will add a radio box for "apply to all".

When using the "Edit Label" function, it currently changes the label for all image patches with the previous label.

Yes intended.

It would be great if we could select an individual image patch, click "Edit Label," and change the label just for that specific patch.

See the comment above: use the selector tool to select an individual annotation, then change the label for just that annotation by choosing a label in the label window or with hot keys (cntrl + wasd).

Additionally, after changing a label, hovering over the new label still shows the old label instead of the updated label (as shown in the screenshot below).

Will fix.

When using the "Add Label" button, it would be helpful if we could select where the image patch is placed. Currently, it seems to create the patch in the middle of the image, which might not be the ideal location.

? I'm not sure what you mean by this one; if you add a label, it will create a label in the label window (this is your labelset), which you can then select to create annotations in the annotation window for that class category. Annotations are made using the cursor. Perhaps git fetch and git pull again?

Lastly, the add, edit, and delete label buttons above the label window are overlapping each other, as seen in the screenshot below.

Resolution and different computer monitor screens 😣 will fix, sorry about that one.

Jordan-Pierce commented 2 months ago

Regarding resolution, can confirm that it needs to update scaling dynamically; these are my settings

If I choose anything less than 150 on a larger monitor it displays as intended. Will have a better fix in the future

Jordan-Pierce commented 2 months ago

@m-h-williams another update if you wanna give it a try; not all has been addressed, but added new stuff and fixed others.

m-h-williams commented 2 months ago

@Jordan-Pierce This is already a fantastic improvement! The confidence window is incredibly helpful for quickly viewing what an image patch looks like. The different colors when a patch is highlighted make it much easier to manage annotations, and the ability to move patches around works seamlessly—this really enhances the annotation process.

The overlap of the edit tabs and the cutoff labels was indeed a scaling issue on my computer. I have a 4K screen, but even without adjusting the scaling, the labels are now much easier to read. Thank you for addressing that!

I have a few more suggestions that I think could further enhance the tool:

Importing Label Sets: Could we add an option to import a label set so that all the label options are pre-loaded in the label window? This would help reduce the chance of typos when adding new annotations, ensuring consistency across different images.
Hotkeys for Scrolling Through Image Patches: The hotkeys for scrolling through labels are great! It would be awesome if we could also use hot keys to scroll through the image patches of a selected label. My thought is that when a label is selected, you could use the confidence window to quickly check each annotation, delete it if it's incorrect, and then automatically move to the next image patch with the same label. Or, if the label is correct, you could use a hot key to move to the next patch. I think this would make cleaning annotations much faster than selecting each image patch one at a time.
Selecting and Deleting Multiple Patches: Could we add a feature that allows users to select multiple patches at once—perhaps by dragging the mouse over them—and then delete or rename the group? For example, in the screenshot below, algae has been incorrectly labeled as pebble. It would be really helpful if we could select these patches in bulk and delete/correct them all at once.

Thank you again for your hard work on these updates! These improvements have already made a big difference.

Jordan-Pierce commented 2 months ago

Importing Label Sets: Could we add an option to import a label set so that all the label options are pre-loaded in the label window? This would help reduce the chance of typos when adding new annotations, ensuring consistency across different images.

See import -> import Labels (JSON); is this what you mean? You can spend time creating the labels, then export them (export -> export Labels (JSON)) and then re-import them the next session. Eventually I'll add an import Labels (CoralNet); in the far future I'll probably define a "Project" so all things are saved there and you get back to the same state just by opening an existing Project.

Hotkeys for Scrolling Through Image Patches: The hotkeys for scrolling through labels are great! It would be awesome if we could also use hot keys to scroll through the image patches of a selected label. My thought is that when a label is selected, you could use the confidence window to quickly check each annotation, delete it if it's incorrect, and then automatically move to the next image patch with the same label. Or, if the label is correct, you could use a hot key to move to the next patch. I think this would make cleaning annotations much faster than selecting each image patch one at a time.

100% agree, that's definitely going to be added. cntrl + wasd for labels, and then cntrl left-right for annotations, and cntrl up-down for adjusting the label within confidence window (you can't see it, but I'll allow the top-5 predictions to be shown with their scores, and then you can choose the different predictions to overwrite the model choice). I also need to add a hotkey for going through the thumbnails...
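The review loop described above (step through the annotations of one label, delete the wrong ones, and land on the next one automatically) can be sketched without any GUI code. This is an illustration only; the `LabelReviewCursor` class and the dictionary keys `id` and `label` are hypothetical, and the hotkey wiring is left out.

```python
class LabelReviewCursor:
    """Cycle through the annotations that carry one label."""

    def __init__(self, annotations, label):
        # Keep only the annotations with the label under review.
        self.items = [a for a in annotations if a["label"] == label]
        self.index = 0

    def current(self):
        return self.items[self.index] if self.items else None

    def next(self):
        if self.items:
            self.index = (self.index + 1) % len(self.items)
        return self.current()

    def previous(self):
        if self.items:
            self.index = (self.index - 1) % len(self.items)
        return self.current()

    def delete_current(self):
        """Remove the current annotation; the cursor then points at the next one."""
        if not self.items:
            return None
        removed = self.items.pop(self.index)
        # Wrap around if the last item in the list was deleted.
        self.index = self.index % len(self.items) if self.items else 0
        return removed
```

In a GUI, `next` and `previous` would be bound to the left / right hotkeys and `delete_current` to the Delete key.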

Selecting and Deleting Multiple Patches: Could we add a feature that allows users to select multiple patches at once—perhaps by dragging the mouse over them—and then delete or rename the group? For example, in the screenshot below, algae has been incorrectly labeled as pebble. It would be really helpful if we could select these patches in bulk and delete/correct them all at once.

100% agree, working on finishing the built-in machine learning, but afterwards I'll look at multiselect using the selector tool (cntrl + left click + drag, shift click, etc.); I should have started with this, but I didn't so now I have to go back and change some stuff.

Will let you know when ML is out.

m-h-williams commented 2 months ago

Thanks, @Jordan-Pierce ! The import/export feature for label sets is what I was looking for. The future plans you mentioned, like importing labels from CoralNet and defining a "Project" for saving everything together, sound like great improvements that will streamline the workflow even further.

Very excited to hear that hotkeys for scrolling through image patches and adjusting labels within the confidence window are on the way. And having hotkeys for navigating through thumbnails will definitely speed up the process — great idea!

The upcoming multi-select feature will be a great improvement, especially when dealing with large datasets. I totally understand that it requires some reworking of the existing setup, but I’m really looking forward to it when it’s ready.

One quick question: When creating new labels, how do I ensure I’m generating the correct unique ID for them? I noticed the IDs in the JSON file are quite specific, and I want to make sure I’m doing it correctly.

Thanks again, I'm really excited to see these new features roll out!

Jordan-Pierce commented 2 months ago

One quick question: When creating new labels, how do I ensure I’m generating the correct unique ID for them? I noticed the IDs in the JSON file are quite specific, and I want to make sure I’m doing it correctly.

So you shouldn't really need to worry about the label IDs, that's more for me in the background. The long string of characters is a UUID, and it's just meant to provide a unique value for a label (again, more for me). The label ID can be simple / changed (for example, 'Review' is kept as -1). If you wanted to alter them, you could, by going to the annotation JSON file and label JSON file, and just doing a find and replace all accordingly. But you shouldn't really need to do that.
If you create a label by adding a new one, or importing from CoralNet CSV, then a label ID will be created for you. Whenever you're done with the project for the day, you save the annotations and / or labels as JSON files, so when you re-import them, everything will be exactly the same. CoralNet CSV files do not contain this information, so each time you export / re-import some of the information is lost (e.g., color). So, there's some work to be done here wrt how it's handled (I'm starting on the CoralNet tools later this week).
If you have an annotation that you're importing and its label (short AND long code) is already present in the project, it will just be added / merged to the existing / matching label. So, to answer your question, for the front-end, it's all about the short and long code, not the IDs.
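The matching rule described above can be illustrated with a small sketch: labels are looked up by their (short code, long code) pair, and a UUID is generated only when no match exists. The function name `get_or_create_label` and the dictionary layout are hypothetical; this is not the toolbox's actual code.

```python
import uuid


def get_or_create_label(labels, short_code, long_code):
    """Return the label matching both codes, creating it if it is missing.

    `labels` is a dict keyed by (short_code, long_code); it is updated in place.
    """
    key = (short_code, long_code)
    if key not in labels:
        labels[key] = {
            "id": str(uuid.uuid4()),  # opaque unique ID, not meant for users
            "short_code": short_code,
            "long_code": long_code,
        }
    return labels[key]
```

Importing the same short and long code twice returns the existing label, so the annotations merge; changing either code produces a new label with a new ID.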

Jordan-Pierce commented 2 months ago

@m-h-williams machine learning has been added; it might be useful to go over it via a call, but briefly:

create a dataset (you need enough annotations of each label in enough images for it to switch to "ready"); for example, if you have a label that only appears in a single image, then it's best to leave that label out. There's random sampling between train / validation / testing so each time you make a modification in the dialog box, it's a new split.
train a model just by pointing to the directory of the dataset you just created. You really don't need to adjust anything here (I'll add instructions in the future); really only: imgsize, epochs, batch size need to be adjusted based on your computer. Use the default image classification model size.
deploy a model you just trained (best.pt); find the file, then load it, and close the window after it says it's loaded. Note that it can only make predictions for the labels it was trained on. If you load the model and those labels aren't present in the label window, then the predictions for them won't be made. You need to import the labels (stick with long label codes for now).
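The dataset step above (drop labels that appear in too few images, then split at random) might be sketched as follows. The threshold `min_images`, the split ratios, and the `image` / `label` keys are all assumptions for illustration; the dialog box applies its own criteria.

```python
import random
from collections import defaultdict


def split_dataset(annotations, min_images=2, ratios=(0.7, 0.15, 0.15), seed=None):
    """Filter out rare labels, then split annotations into train / val / test."""
    images_per_label = defaultdict(set)
    for ann in annotations:
        images_per_label[ann["label"]].add(ann["image"])

    # A label is kept only if it appears in at least `min_images` images.
    ready = {lab for lab, imgs in images_per_label.items() if len(imgs) >= min_images}
    kept = [a for a in annotations if a["label"] in ready]

    # Without a seed, every call gives a different split.
    rng = random.Random(seed)
    rng.shuffle(kept)

    n_train = int(len(kept) * ratios[0])
    n_val = int(len(kept) * ratios[1])
    return {
        "train": kept[:n_train],
        "val": kept[n_train:n_train + n_val],
        "test": kept[n_train + n_val:],
    }
```

Leaving `seed` unset mirrors the behaviour described above, where each modification produces a new random split.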

To use a model; you have three options:

select an annotation and press cntrl + z
press cntrl + z and all 'Review' labels in the image will be classified at once
go to the sample annotations (after loading the model), and select the option to apply predictions

Predictions will show in the confidence window when the annotation is selected (if you move or adjust their size, you'll get a warning, as once you change a prediction, it doesn't really hold true anymore). I recommend switching to left hand on cntrl + arrows, and right hand on mouse. Use cntrl + left / right to cycle through all annotations on an image, and use the mouse to select any of the predicted categories, or any other label in the label window. To switch to the next image, do cntrl + up / down.
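As a rough illustration of what the confidence window shows, the sketch below picks the top-scoring predictions for one annotation and ignores any class that is not in the label window. The function name `top_predictions` and the plain dictionary of scores are assumptions, not the toolbox's internal data structures.

```python
def top_predictions(scores, available_labels, k=5):
    """Return up to k (label, score) pairs, highest score first.

    `scores` maps label name to confidence; labels missing from
    `available_labels` (the label window) are skipped.
    """
    usable = {label: s for label, s in scores.items() if label in available_labels}
    return sorted(usable.items(), key=lambda item: item[1], reverse=True)[:k]
```

For example, with scores for coral, algae, and sand but only coral and algae imported as labels, the sand prediction would not be offered.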

Last thing to note is the model: stick with the smallest (tiny / t) model; you can train even on a CPU in a reasonable amount of time, and inference is fast. The models are coming from Ultralytics, so please make yourself familiar with their license.

m-h-williams commented 2 months ago

@Jordan-Pierce I used the machine learning option in the annotate tool, and it worked great! I’d love to set up a call to go over it in more detail. I’ll email you to arrange a time.

I have a couple of small suggestions:

Could we have the option to specify where we want the model to be saved? I was able to define the locations for the test, train, and validation folders during training, but the model itself was saved elsewhere—specifically, in your GitHub repo under /Data/Training/. It would be really helpful if we could choose the save location for the model, just like we can specify the name of the folder for training.
Additionally, it would be great if we could also specify the model name when saving it.
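For context on the two suggestions above: when training with Ultralytics directly, the `project` and `name` arguments of `train` control the output directory and the run name, so a save location and model name could plausibly be exposed through them. The sketch below is an assumption about how that might be wired up, not how the toolbox currently does it; the helper names and the default values are hypothetical, and the weights file is left to the caller.

```python
def build_train_args(dataset_dir, output_dir, run_name,
                     imgsz=224, epochs=50, batch=16):
    """Collect keyword arguments for an Ultralytics training run."""
    return {
        "data": str(dataset_dir),    # folder holding the train / val / test splits
        "project": str(output_dir),  # parent directory for training outputs
        "name": run_name,            # sub-directory (run name) for this model
        "imgsz": imgsz,
        "epochs": epochs,
        "batch": batch,
    }


def train(weights, **train_args):
    """Train a classification model; requires the ultralytics package."""
    from ultralytics import YOLO  # imported lazily so the helper above works without it

    model = YOLO(weights)
    return model.train(**train_args)
```

A call such as `train("yolov8n-cls.pt", **build_train_args("Data/MyDataset", "Models", "reef_v1"))` would be expected to write its best weights under `Models/reef_v1/weights/best.pt`; the weights file and paths here are examples only.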

I'll definitely familiarize myself with the models from Ultralytics and their license. If there’s anything else I should be aware of or any additional resources you recommend, please let me know.

Thanks again for all the work you’ve put into these updates!