Kitware / dive

Media annotation and analysis tools for web and desktop. Get started at https://viame.kitware.com
https://kitware.github.io/dive
Apache License 2.0

[FEATURE] Pipeline/Training Manifest for descriptions #557

Open BryonLewis opened 3 years ago

BryonLewis commented 3 years ago

Gathering some requirements and ideas for a structure to hold the JSON pipelines, as well as possibly the training configs, for the system.

My guess at requirements:

BryonLewis commented 3 years ago

Categories:

Each pipeline/training config:

Description:

BryonLewis commented 3 years ago

I made a first (admittedly rough) attempt here:

{
    "argumentTypes": {
        "kwiverPipeline": {
            "platforms": ["web", "desktop"],
            "supports": [
                "image-sequence",
                "video"
            ],
            "arguments": {
                "global": {
                    "pipeline": {
                        "arg": "-p {pipeline}",
                        "type": "pipe"
                    },
                    "input_file": {
                        "type": "txt",
                        "description": "Either list of images for the folder or a single input video file",
                        "arg": "-s input:video_filename={input_file}"
                    },
                    "detector_output_file": {
                        "arg": "-s detector_writer:file_name={detector_output_file}",
                        "description": "Output for the detections"
                    },
                    "track_output_file": {
                        "type": "viame_csv",
                        "arg": "-s track_writer:file_name={track_output_file}"
                    }
                },
                "video": {
                    "video_reader": {
                        "arg": "-s input:video_reader:type=vidl_ffmpeg",
                        "description": "additional input required for video reading"
                    }
                }
            }
        },
        "kwiverPipelineInput": {
            "platforms": ["web", "windows", "linux", "mac"],
            "supports": [
                "image-sequence"
            ], //Right now it only works on image sequences
            "arguments": {
                "global": {
                    "detection_input_file":{
                        "arg": "-s detection_reader:file_name={detection_input_file}",
                        "type": "viame_csv"
                    },
                    "track_input_file": {
                        "arg": "-s track_reader:file_name={track_input_file}",
                        "type": "viame_csv"
                    }        
                }
            }
        },
        "kwiverTraining": {
            "platforms": ["web", "desktop"],
            "supports": [
                "image-sequence",
                "video"
            ],
            "arguments": {
                "global": {
                    "input_folder_file_list": {
                        "arg": "-il {input_folder_file_list}",
                        "type": "txt",
                        "description": "List of folders and/or videos used for training inputs"
                    },
                    "input_ground_truth_list": {
                        "arg": "-it {input_ground_truth_list}",
                        "description": "List of the groundTruth viame_csv files for each item in the folder_file_list"
                    },
                    "training_configuration_file": {
                        "arg": "-c {training_configuration_file}",
                        "type": "conf"
                    },
                    "prevent_interruptions": {
                        "arg": "--no-query",
                        "description": "Argument to prevent training from needing user input"
                    }
                },
                "desktop": {
                    "no-adv-prints": {
                        "arg": "--no-adv-prints",
                        "description": "Changes console logging style for windows"
                    },
                    "no-embedded-pipe": {
                        "arg": "--no-embedded-pipe",
                        "description": "Desktop requirement to train pipelines properly."
                    }
                }
            }
        }
    },
    "Pipes": [
        {
            "name": "Generic Tracker",
            "filename": "tracker_generic.pipe",
            "backend": "kwiver", // Future other backend pipes that can be run
            "argumentTypes": [
                "kwiverPipeline"
            ],
            "description": "Generic tracker which returns tracks with confidence.  Maybe some more about the system",
            "types": [
                "vertebrate",
                "invertebrate"
            ], //These might not match up but you get the idea
            "category": "Trackers", // Trackers | Detectors | Training | Utilities
            "tags": [
                "generic",
                "tracker"
            ], // Hopefully used in the future for filtering like removing 'netharn' or 'local' or 'svm',
            "requirements": {
                "gpu": false,
                "gpuMinMemoryGBs": "1" // Optional; 0 means no minimum GPU memory
            }
        },
        {
            "name": "User Input Detections Tracker",
            "filename": "tracker_track_user_selections.pipe",
            "backend": "kwiver", // Future other backend pipes that can be run
            "argumentTypes": [
                "kwiverPipeline",
                "kwiverPipelineInput"
            ], //It takes both the kwiverPipeline and kwiverPipelineInput args
            "description": "Provide detections on first frame of objects to produce tracking results",
            "types": [], // None for this pipeline
            "category": "Utilities", // I don't know if this is a utility or not
            "tags": [
                "input",
                "tracker"
            ], // Hopefully used in the future for filtering like removing 'netharn' or 'local' or 'svm',
            "requirements": {
                "gpu": true,
                "gpuMinMemoryGBs": "1" // Optional; 0 means no minimum GPU memory
            }
        },
        {
            "name": "Train Netharn Cascade",
            "filename": "train_netharn_cascade.viame_csv.conf",
            "backend": "kwiver", // Future other backend pipes that can be run
            "argumentTypes": [
                "kwiverTraining"
            ],
            "description": "Takes input detections and trains a model of the corresponding types",
            "types": [], 
            "category": "Training", // I don't know if this is a utility or not
            "tags": [
                "training",
                "netharn",
                "cascade",
                "viame_csv"
            ], 
            "requirements": {
                "gpu": true,
                "gpuMinMemoryGBs": "2" // Optional
            }
        }
    ]
}
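For illustration, here is one way a runner might consume the `arg` templates above, expanding the `{placeholder}` tokens into a kwiver CLI argument list. This is just a sketch; the helper names (`expandArg`, `buildArgs`) are hypothetical and not part of DIVE:

```typescript
// Expand "{placeholder}" tokens in an arg template using a values map.
// Throws if a required placeholder has no value provided.
function expandArg(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = values[key];
    if (value === undefined) throw new Error(`missing argument: ${key}`);
    return value;
  });
}

// Build the full argument list for one pipeline invocation from a manifest
// entry's "arguments" map (only the "arg" field is used here).
function buildArgs(
  args: Record<string, { arg: string }>,
  values: Record<string, string>,
): string[] {
  return Object.values(args).map((a) => expandArg(a.arg, values));
}

const argv = buildArgs(
  {
    pipeline: { arg: '-p {pipeline}' },
    input_file: { arg: '-s input:video_filename={input_file}' },
  },
  { pipeline: 'tracker_generic.pipe', input_file: 'video.mp4' },
);
// argv: ['-p tracker_generic.pipe', '-s input:video_filename=video.mp4']
```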
subdavis commented 3 years ago

This is a little heavier than I think we need right now. I'm worried that this may be over-engineered, but I'm certain that this change takes complexity represented in code and moves that complexity into a configuration scheme.

I'd rather deal with complexity in code because I think TypeScript is safer and more expressive than JSON, and provides better tools to manage that complexity. We can also absorb changes much more easily in code.

It also has a few features that control things we don't support (and have no plan to support).

My proposal would be to...

The rest of the pipes schema looks great. tags, types, etc. are all awesome. I'd like to follow up about some external factors, like where to look for this file.

This isn't a total rejection of argumentTypes, but I don't think there's a strong case for them yet. This could be broken into multiple parts, where we first implement argumentTypes in TS, then later, if that choice appears to be causing problems, we migrate that complexity into config.

What do you think?

BryonLewis commented 3 years ago

I think my goal was just to have argumentTypes defined someplace as a union between different sets of arguments for pipes, data types (video/image-sequence), and platforms (web/desktop), without having to look through each individual pipe for its parameters. Having it live in the TS and Python backends is better, I think, as long as we have good documentation for what each type expects/requires.
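If argumentTypes did live in TypeScript rather than config, they could be expressed as typed constants so the compiler checks platform and dataset names. A rough sketch under that assumption (names and the trimmed-down shape are hypothetical):

```typescript
// Hypothetical TypeScript encoding of argumentTypes: literal union types
// stand in for what the JSON manifest expressed as string arrays.
type Platform = 'web' | 'desktop';
type DatasetType = 'image-sequence' | 'video';

interface ArgumentType {
  platforms: Platform[];
  supports: DatasetType[];
  args: Record<string, string>; // argument name -> kwiver CLI template
}

const KwiverPipeline: ArgumentType = {
  platforms: ['web', 'desktop'],
  supports: ['image-sequence', 'video'],
  args: {
    pipeline: '-p {pipeline}',
    detector_output_file: '-s detector_writer:file_name={detector_output_file}',
  },
};

const KwiverTraining: ArgumentType = {
  platforms: ['web', 'desktop'],
  supports: ['image-sequence', 'video'],
  args: {
    input_folder_file_list: '-il {input_folder_file_list}',
    training_configuration_file: '-c {training_configuration_file}',
  },
};
```

A misspelled platform like `'dekstop'` would then be a compile error instead of a silent config bug, which is much of the safety argument for keeping this in code.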

f4str commented 3 years ago

@subdavis seems to be completed, can you verify?

subdavis commented 3 years ago

That PR was related, but does not address the actual topic of this issue. We still have a lot of work to do to come up with a way to reliably describe the configurations available in DIVE.

readicculus commented 3 years ago

@subdavis Just want to add some thoughts to the code vs. JSON schema discussion. I don't have input on the specifics of implementation, but I think it would be useful to consider how custom pipelines specific to a group or project fit in. Looking at the add-on packages now available in VIAME, it's clear that projects need to be able to use their own pipelines with these interfaces, and there's a good argument for giving people a way to run custom pipelines in DIVE. Though this isn't entirely within the subject of this issue, considering it here could make it easier (or harder) to add broader support for custom pipelines in the future. A few notes:

Just some thoughts. Even though we're moving towards running pipelines elsewhere, I do think there is benefit to having some of our pipelines in DIVE, and I imagine other groups with VIAME add-ons may feel similarly, if not now then likely at some point.

Edit: Also I'm only talking about processing pipelines like detectors, trackers, etc... not training pipelines

subdavis commented 2 years ago

I'd sort of like to abstract this to have some `function (pipeline_name: str) -> Capabilities and Requirements Dict`. Right now that would be based on name, but in the future we can incorporate other out-of-band knowledge.
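The lookup described above could start as a name-based function and later grow other inputs. A minimal sketch, assuming hypothetical name heuristics and a trimmed-down capabilities shape:

```typescript
// Sketch: derive capabilities/requirements from the pipeline name alone.
// The heuristics and the Capabilities shape here are illustrative only.
interface Capabilities {
  gpu: boolean;
  gpuMinMemoryGBs: number;
  category: 'Trackers' | 'Detectors' | 'Training' | 'Utilities';
}

function pipelineCapabilities(pipelineName: string): Capabilities {
  // Name-based for now; out-of-band metadata could replace this later.
  if (pipelineName.startsWith('train_')) {
    return { gpu: true, gpuMinMemoryGBs: 2, category: 'Training' };
  }
  if (pipelineName.startsWith('tracker_')) {
    return { gpu: false, gpuMinMemoryGBs: 0, category: 'Trackers' };
  }
  return { gpu: false, gpuMinMemoryGBs: 0, category: 'Utilities' };
}

const caps = pipelineCapabilities('train_netharn_cascade.viame_csv.conf');
// caps.category: 'Training'
```

Keeping this behind one function means the manifest discussion above can proceed independently: the function body can read a JSON manifest, hard-coded TS constants, or both.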