Feature List of EVERYTHING

A list of every feature planned for the short- to long-term future.

Suggestions are welcome.

I will move items off this list into individual tickets, to be worked on as time permits:

- new models for tagging and/or captioning
- new augmentation options
- new image board support (anime & gelbooru)
- two new image tagging modes, one for tag-based models and one for caption-based models
- a new UI for both modes
- a new layout
- a better way of outlining tags and words in different categories
- faster response times for tag/word suggestions
------------
- integration of segmentation & object detection models
- a new feature that automatically curates data for the user, based on a few questions the user answers to determine the heuristics of the model
- CogVLM support
- LLM support for optionally improving captions. For example, CogVLM models struggle with NSFW tagging, so it may be better to take the tags from a tagging model, have an LLM reformat them into a caption, and then, as a second pass, have the VLM validate that caption. Alternatively or additionally, an LLM could merge its own caption with the CogVLM caption into a single caption that keeps the useful semantics of both.
- more user configurable options to caption their data
- new methods of pruning tags, again based on heuristics derived from Q&A with the user and an LLM and/or VLM
- a new wiki for the tutorials
- super-resolution & denoising model support to remove noise artifacts or adversarial perturbations from images the user may want to train with, e.g. mitigating some of the effects of Nightshade & Glaze
---------
- new tag, image, & caption statistics and visualization tools, giving the user a data scientist's perspective on how best to choose and augment their data
- a custom-trained vision classifier that labels images as [nightshade, glaze, both, nothing], so the user knows which data has been poisoned by artists etc. and whether it needs to be de-noised / upscaled to mitigate the effects to some extent
---------
- tagging/captioning models will be downloaded on the fly when they are selected in the dropdown menu, instead of having to be fetched from the download tab beforehand
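
As a rough illustration of the on-the-fly download item above, here is a minimal sketch, not the tool's actual code. `ensure_model`, the `model_dir` layout, and the injected `download` callable are all hypothetical names; the real download logic (URLs, checksums, progress display) is deliberately left out.

```python
from pathlib import Path
from typing import Callable


def ensure_model(
    name: str,
    model_dir: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return the local path for `name`, downloading it first if it is missing.

    `download` is a hypothetical callable (model name, target path) that
    fetches the weights; it is injected so the sketch stays self-contained.
    """
    target = model_dir / name
    if not target.exists():
        # Only touch the network the first time a model is selected.
        model_dir.mkdir(parents=True, exist_ok=True)
        download(name, target)
    return target
```

The dropdown's change handler could call `ensure_model` with the selected model name before loading it, so the download tab becomes optional rather than a required first step.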

[X] Update the existing Version_3 WebUI WIKI Page for the Version_4 WebUI
[X] Finish Code Refactor
[X] Conda setup instructions
[X] CSV load time optimization with the pandas framework
[X] .sh & .bat installer scripts for conda
[X] Image Board manager class object
[X] PNG Info & tag combination options

New features are paused as of 09/05/2023, unless there are willing contributors to develop any of the other features.

New image-board-specific tagging/captioning models will be supported as they are released. (There is currently no ETA for those models, which are being developed by others.)

Contributors are welcome to open a Pull Request with their work, and I will promptly review it for inclusion.

[ ] Add aliases for tag suggestions in the textboxes
[ ] Add support for brand-new tagging & captioning models & tag-combining options
- deepdanbooru
- huggingface IDEFICS (api call)
- gpt-4 (api call)
[ ] Add an auto-caption feature that uses various heuristics to determine which tags from each auto-tag/caption model are best
[ ] Include support for a variety of public image boards
[ ] Add De-Noise & Upscale Models, e.g. StableSR
[ ] Add Segmentation & Detection Models, e.g. SegmentAnything-HQ
[ ] Add Cross Attention Visualization (DAAM)
[ ] Add Grad-CAM
[ ] Add UMAP
[ ] Color code tag categories for tag suggestions in the dropdown menus (blocked : https://github.com/gradio-app/gradio/issues/4988)
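
For the alias item in the list above, a minimal sketch of how alias-aware suggestions might work, assuming aliases are stored as a plain alias-to-canonical-tag mapping. `suggest_tags` and the sample tags and aliases are hypothetical and not taken from the tool's code or from any image board's data.

```python
from typing import Dict, List


def suggest_tags(
    prefix: str,
    tags: List[str],
    aliases: Dict[str, str],
    limit: int = 10,
) -> List[str]:
    """Return canonical tags whose name, or one of whose aliases, starts with `prefix`."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    matches: List[str] = []
    seen = set()
    # Direct matches on canonical tag names come first.
    for tag in tags:
        if tag.lower().startswith(prefix) and tag not in seen:
            seen.add(tag)
            matches.append(tag)
    # Then aliases, resolved to their canonical tag and de-duplicated.
    for alias, canonical in aliases.items():
        if alias.lower().startswith(prefix) and canonical not in seen:
            seen.add(canonical)
            matches.append(canonical)
    return matches[:limit]


# Hypothetical sample data for illustration only.
tags = ["long_hair", "looking_at_viewer", "smile"]
aliases = {"longhair": "long_hair", "grin": "smile"}

print(suggest_tags("gr", tags, aliases))  # ['smile']
print(suggest_tags("lo", tags, aliases))  # ['long_hair', 'looking_at_viewer']
```

Resolving aliases to canonical tags before display keeps the suggestion list free of duplicates, at the cost of hiding which alias matched; showing the alias alongside the tag would be a reasonable alternative.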
