clamsproject / app-swt-detection

CLAMS app for detecting scenes with text from video input
Apache License 2.0
1 stars 0 forks source link

future of the stitcher module #117

Closed keighrim closed 3 weeks ago

keighrim commented 2 months ago

Because

As we started a separate evaluation (https://github.com/clamsproject/aapb-evaluations/issues/60) on the stitcher performance (that's somewhat independent from the performance of the CV models for image classification), we want to decide the future of the stitcher module in this app, and the app's relation to the standalone stitcher app (https://github.com/clamsproject/app-simple-timepoints-stitcher/).

Done when

Additional context

This issue was briefly discussed in the past when we were working on https://github.com/clamsproject/app-swt-detection/issues/96. See this comment https://github.com/clamsproject/aapb-evaluations/issues/60#issuecomment-2311132415 for preliminary conclusions from some recent discussion. Also, copied from some exchanges over our slack channel.


@owencking : Following up on a discussion that @keighrim and I had at the end (well, after the end) of our call today... We were talking about how TimePoint -> TimeFrame stitching functionality should (or should not) be integrated into other CLAMS apps and how that is reflected in MMIF. There were a few issues. First, whether to separate views in output MMIF: Keigh said -- and I fully agree -- that SWT should put its TimePoint annotations and TimeFrame annotations in separate views. This makes sense because we could have multiple sets of TimeFrame annotations that refer to a single set of TimePoint annotations. The other questions were about whether the stitcher should be integrated into SWT or stand-alone. Option 1: SWT produces only TimePoint annotations and does no stitching. In addition, a separate stitching app does stitching and produces views with TimeFrame annotations Option 2: SWT has the stitching functionality built in. SWT can be run in 3 modes: CV-only mode (no stitching), Stich-only mode (no CV), and CV+stitching (presumably, the default mode) Option 3: SWT always runs the CV step and has its own stitcher, which can be turned off. There is also a separate stitcher app. I think Keigh and I agree that Option 3 (which is what we have now) is the worst option, because it involves maintaining two different implementations of the same stitching algorithm. So, which is better, Option 1 or Option 2? During our call, I was thinking Option 1 was better because it seemed more CLAMS-like and more Unix-like: Each CLAMS app does one thing well, and the apps are composable to create complex functions. Also, it would allow us to have a choice of multiple different stitching apps on top of SWT's TimePoint annotations. Well, I took a walk in the evening and thought about it. Upon reflection, I am pretty convinced that Option 2 is preferable. I have one practical reason and one conceptual reason for this. The practical reason is that SWT will be the workhorse for a lot of GBH work, and having a single app for TimePoints and TimeFrames will make our lives much easier! Plus, there is a small (but significant) performance cost to calling more two Dockerized app instead of just a single one. The conceptual reason is that the stitching algorithm we prefer might be somewhat specific to SWT. As Keigh explained to me, the stitching used for the audio tagging app is quite different. SWT-style stitching may or may not be the best general-purpose stitching algorithm. So, maybe it makes sense, as a conceptual matter, for the stitcher to be integrated into the app that finds the TimePoints that need to be stitched. What do you think? Option 1 or Option 2?


@keighrim : I have a very strong opinion for the option1 for reasons like

  1. conceptual clarity of having two apps (SWT, stitcher) generating two views (TP view, TF view, respectively)
  2. conceptual clarity when you use the (same) stitcher for other SWT-like classification apps. For example, imagine we have a binary classifier to detect "screen bugs", the following pipelines doing the same job, but think about their naming.
    • running screenbug-detector (TP annotations) -> SWT-in-tfmode (TF annotations) vs.
    • running screenbug-detector (TP annotations) -> timepoint-stitcher (TF annotations)
  3. easiness of replacing the smoothing/stitching algorithm with a future implementation. Namely, when we want to try out a different smoothing algorithm (https://en.wikipedia.org/wiki/Smoothing) , it's much (technically and conceptually) easier to use a new stitching app, instead of adding a new stitching "mode" (probably with a completely different argument structure of parameters and configs) to the all-in-one SWT app.

However, for option2,

  1. majority of my argumentation for option1 above is for the future that's not happened and might not be happening at all after all
  2. I can't ignore the user-friendliness of having an all-on-one app that supports a single call to generate all relevant annotation. From the pure practicality perspective, I think just this reason outweighs all the "conceptual" justification of mine in the above.
  3. finally, the main reason that made me reluctant to separate stitcher from the SWT in our past consideration (https://github.com/clamsproject/clams-python/issues/197) is still valid. Namely, the knowledge of "postbin" configuration is very tightly coupled with the design of SWT annotation and CV model, but if we separate stitcher app, the user needs to have that knowledge to properly configure the stitcher app (with labelMap= parameter) to chain it with the SWT. This problem is now somewhat mitigated after labelMapPreset= parameter was added, but that's only a minor technical remedy, and can't solve the conceptual problem of leaking information between two components.

For the record, I'm still open to option3, as long as I'm not the one who maintains/manages the two (duplicate) stitchers in two codebases.


@owencking : Well, I do think we see all the pros and cons the same way! One reason I'm still favoring Option 2 is that it doesn't block us from future innovation with stitching. (In other words, neither Option 1 nor Option 2 blocks further development/innovation.) In particular, I am okay with the idea that SWT ships with its default stitcher (which works pretty well), but users (including us) have the option of turning it off and using a separate stand-alone stitcher. Maybe that is slightly inelegant. But it's not too bad, right?

keighrim commented 3 weeks ago

121 addresses the changes discussed here.

More specifically, the PR makes these changes: