Open dobkeratops opened 4 years ago
Interesting idea! :D
Do you think the thumbnails would be big enough to identify the individual objects?
> This would let a keyboard user flick through and add labels with hands 100% on the keyboard, it would be blindingly fast :)
Totally agreed, a keyboard only workflow would be really nice!
Something completely different: What do you think about integrating an ImageMonkey backend into existing annotation tools? Would it be worth the effort?
Tangentially, I would also suggest a +/- zoom option for the thumbnails, but as they stand they’re usually big enough to see a lot of the content, even on the iPad, which has a nice HiDPI display. Even if you just assign a scene/theme label, you’ve turned the unlabeled image into a trainable sample. The benefit of the dedicated screen is of course the suggestions (which could eventually be dynamic).
Re ImageMonkey backend in existing annotation tools: I guess the advantage is if people are already used to those tools.
Ultimately it comes down to your own focus: do you prefer a balance of all features, with ImageMonkey being a standalone platform, or do you prefer to defer advanced annotating and concentrate on the integrated training etc. (which I hadn’t looked into until thinking about rendering)?
A way to get annotations from a paint tool might be interesting (e.g. a way to tie a photo to colour-coded scribbles). I say that because I like doodling on the iPad with Procreate.
Maybe this could help allow searching for unlabelled images in unified mode?
I.e. if you click on an unlabeled image in the unified search, fire up an add-label `prompt()` first; then the unified screen wouldn’t have to cope at all with the state of no available label.
Just experimented a bit with vim bindings (aka "keyboard only mode"). One of the problems I currently have is that I don't know where best to place the label input field.
my requirements are:
Any suggestions?
Hmm. In theory, it’s either a pop-up input box placed above or below the highlighted image (chosen depending on current screen position), or a pane that doesn’t scroll with the rest (imagine if the search box itself stayed visible along with the top menu, and the images scrolled below them; a label input box or current label list could be displayed in there). Actually this ties into another possible suggestion for a bulk tagging mode that could grow into a replacement for verification. Let me explain with a mock-up, incoming ..
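A minimal sketch of the "above or below, depending on screen position" choice described here. The `Box` type and `choosePlacement` function are hypothetical names for illustration, not part of the ImageMonkey code base; the rule of falling back to the roomier side is an assumption.

```typescript
// Hypothetical geometry type: position of the highlighted thumbnail
// relative to the top of the visible viewport, in pixels.
interface Box {
  top: number;
  height: number;
}

type Placement = "above" | "below";

// Prefer placing the label input below the highlighted image. If it does
// not fit there, place it above; if it fits nowhere, use the roomier side.
function choosePlacement(image: Box, popupHeight: number, viewportHeight: number): Placement {
  const spaceAbove = image.top;
  const spaceBelow = viewportHeight - (image.top + image.height);
  if (spaceBelow >= popupHeight) return "below";
  if (spaceAbove >= popupHeight) return "above";
  return spaceBelow >= spaceAbove ? "below" : "above";
}
```

In a browser, the `Box` values could come from the thumbnail's bounding rectangle and the viewport height from the window, but that wiring is left out here.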
> Actually this ties into another possible suggestion for a bulk tagging mode that could grow into a replacement for verification..
On a related note: for refinements there exists a mode for bulk tagging, see: https://imagemonkey.io/refine?mode=browse
e.g. tick the "smart refinement" box, type `person` in the first input field, select `male` in the second input field and press `Go`. Then select every image where a male person is annotated. This allows you to add the `male` property in bulk to existing annotations. But of course it's not that flexible.
This is a tangentially related suggestion .. this discussion reminded me of it. It’s another way to accelerate label entry, but probably more of a departure. The idea would be to be able to set the “add..” label at any time whilst scrolling through a search, and it would show you which images do or don’t have the second queried label. Set them the same, and you’d have a bulk validation mode (click the odd ones out to invalidate).
I admit this sounds quite complicated; it might need more thought.
Ah, I see the “smart refinement” mode is very similar to this “bulk tagging mode” already (I wasn’t aware of it). It could handle this “car -> sportscar” example. Perhaps if graph enabled, things like “animal -> dog” would generalise it further. OK, so back to the problem of where to put the input box. (The above tangent of “bulk labelling” is perhaps a separate suggestion to generalise smart refinement instead.)
A simpler idea to speed up keyboard-only use is just hotkeys for “done”, and for focussing the existing text entry box in the label entry view. Perhaps just pressing enter or space or any alphanumeric key without a modifier could focus the text entry box.
That alone would be quite a speed-up (saves 2 mouse-aiming tasks; clicking on images in search isn’t so bad because they’re quite large).
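A rough sketch of the "any unmodified alphanumeric key focuses the text entry box" idea. `KeyPress` mirrors a few fields of the browser's `KeyboardEvent`; `shouldFocusLabelInput` is a hypothetical helper. Space is left out here on the assumption that it is already used to press the highlighted button.

```typescript
// Subset of the browser's KeyboardEvent fields that the decision needs.
interface KeyPress {
  key: string;
  ctrlKey: boolean;
  altKey: boolean;
  metaKey: boolean;
}

// Decide whether a key press should move focus to the label input.
function shouldFocusLabelInput(press: KeyPress, inputHasFocus: boolean): boolean {
  if (inputHasFocus) return false; // already typing, nothing to do
  if (press.ctrlKey || press.altKey || press.metaKey) return false; // leave chords alone
  if (press.key === "Enter") return true;
  return /^[a-z0-9]$/i.test(press.key); // single letter or digit
}
```

A page-level `keydown` listener could call this and, when it returns `true`, focus the input so the typed character lands in it.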
> Ah I see the “smart refinement” mode is very similar to this “bulk tagging mode” already (I wasn’t aware of it). It could handle this “car -> sportscar” example. perhaps if graph enabled things like “animal -> dog” would generalise it further.
Yeah, right. I created it mainly as a PoC, but if it is useful we could definitely extend it further. I think it could e.g. be used to refine `wall`, `table`, `fence` etc. with material properties like `stone`, `wooden`, `metal`, etc. The only thing is that the actual object needs to be annotated first (as there could exist multiple objects with different properties, e.g. `wooden table`, `plastic table` etc.)
So a possible workflow could look like this:
table
That's of course a bit of overkill if you do it for just one image, but if you've annotated hundreds of tables, I think the refinement could really be sped up that way.
> A simpler idea to speed up keyboard only use is just hotkeys for “done”, and for focussing the existing text entry box in the label entry view.
That's a nice idea - like it! (We can always add the floating input box later to further speed things up.) Do you have a hotkey suggestion for the "Done" button? We already have `shift+enter` to add a label. Maybe `ctrl+enter` for the "Done" button, or is that too confusing?
I am not sure where the `ctrl` key on a UK keyboard is located, but at least on a German keyboard `shift` and `ctrl` are nearby... which could mess up the muscle memory. On the other hand, we could always make the hotkey configurable, so we could just pick a key combination that's ideal for a specific user group. And if the key doesn't work for you, you can change the default setting. If you have any suggestions, please let me know :)
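A sketch of what configurable hotkeys could look like, assuming a small table of default bindings that a user setting can override. All names (`Action`, `Combo`, `defaultBindings`, `withOverrides`, `resolveAction`) are hypothetical; the defaults just mirror the `shift+enter` and `ctrl+enter` combinations mentioned above.

```typescript
type Action = "addLabel" | "done";

// A key combination; omitted modifiers mean "must not be pressed".
interface Combo {
  key: string;
  shift?: boolean;
  ctrl?: boolean;
  alt?: boolean;
}

type Bindings = Record<Action, Combo>;

// Assumed defaults, taken from the combinations discussed in this thread.
const defaultBindings: Bindings = {
  addLabel: { key: "Enter", shift: true },
  done: { key: "Enter", ctrl: true },
};

// Merge user overrides (e.g. loaded from settings) over the defaults.
function withOverrides(overrides: Partial<Bindings>): Bindings {
  return { ...defaultBindings, ...overrides };
}

// Subset of the browser's KeyboardEvent fields.
interface KeyPress {
  key: string;
  shiftKey: boolean;
  ctrlKey: boolean;
  altKey: boolean;
}

function matches(combo: Combo, press: KeyPress): boolean {
  return (
    combo.key === press.key &&
    (combo.shift ?? false) === press.shiftKey &&
    (combo.ctrl ?? false) === press.ctrlKey &&
    (combo.alt ?? false) === press.altKey
  );
}

// Return the action bound to this key press, if any.
function resolveAction(press: KeyPress, bindings: Bindings): Action | undefined {
  const actions = Object.keys(bindings) as Action[];
  return actions.find((action) => matches(bindings[action], press));
}
```

A user who finds `ctrl` and `shift` too close together could then remap only "done", for example `withOverrides({ done: { key: "Enter", alt: true } })`, and leave the other binding untouched.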
Another (completely different and more technical) approach would be to add vim bindings. There seem to be two groups of people out there: the ones that love vim and the ones that hate vim. Not sure which group you belong to ;)
An example for vim bindings in the browse based label mode:
`i`
`Esc` and `:wq` to save your changes (would be similar to pressing the "Done" button) and move back to the image grid, where you can select the next image
`:w` instead of `:wq`
`:+` to zoom into the image?
`i` again.

I think there wouldn't be a lot of users out there who would use such a mode, but I think it could be highly efficient for keyboard warriors. And I guess it also depends on whether you are used to vim or not.
Not sure if it is a good idea... just wanted to throw it out there :)
Ctrl and shift are close to each other on all keyboard layouts as far as I’m aware, certainly on U.K. and USA layouts (I just checked what German looks like; `Strg` is in the same places I see `ctrl`).
I’m not personally a vim user, but you cover half the developer population with those. They might not be mutually exclusive. So if vim users can find chorded commands starting with `:`, we just need to not use `:` for anything else. It should be OK. I note vim still lets you use the colon in “insert mode” (the equivalent of being in a text entry box). In vim “esc” toggles between Command and Insert mode? So a vim user might expect esc to toggle highlighting a text entry box? Again, that can work fine alongside other bindings so long as we don’t need esc for something else. (I tend to stick with Windows-style bindings. I used emacs a lot but changed its layout to use ctrl-s for save, ctrl x/c/v for cut, copy, paste, etc.)
Continuing the analogy of a text editor, I see vim `:wq` is a combined “write and quit”, which does make more sense than plain save.
I just pressed ctrl-S to check, and it tells the browser to save the page, so that’s not an option. “Alt-S” maybe?
As for focusing the entry box, it could be any unbound key. Unmodified enter perhaps (I think space is used to press a highlighted button, so it can’t be that).
Is it possible to tweak the button order for tab cycling (which the browser does automatically)? What I currently see is that from the edit box you can press tab to highlight “add” (and shift-tab to go back). Currently “done” seems inaccessible via tab (maybe that was deliberate, to avoid an accidental press?).
If it was “tab tab space” (move the highlighted button twice from the edit box), that would be usable enough I think (you could discover it through visual feedback of the highlight moving around).
> I note vim still lets you use the colon in “insert mode” (the equivalent of being in a text entry box). In vim “esc” toggles between Command and Insert mode?
Yeah, right. So pressing `i` (for `i`nsert) would put focus on the label field (which is also the default state right after the page loads). When you are in insert mode, you can type any character. When you press `Esc`, the focus will be removed from the label input field and the status bar at the bottom gets focus.
The reason why I think vim bindings could be nice is that we could continually extend the status bar with functionality (possibility to delete labels, a help command which lists all available commands, a command to show all existing bounding boxes for that image, etc). By using the status bar as the control, we could get rid of some buttons and make more space for the image.
I would make the vim bindings optional. If you aren't using them, the label view should also be usable via keyboard only (so e.g. "Done" mapped to `alt+s`).
> I just pressed ctrl-S to check, and it tells the browser to save the page .. so that’s not an option. “Alt-S” maybe?
I guess that would work :)
> Is it possible to tweak the button order for tab cycling (which the browser does automatically) - what I currently see is from the edit box you can press tab to highlight “add”. (And shift tab to go back) currently “done” seems inaccessible via tab (maybe that was deliberate to avoid accidental press?). If it was “tab tab space” (move the highlighted button twice from the edit box) that would be usable enough I think (you could discover it through visual feedback of the highlight moving around)
Good idea, I'll have a look :)
Well, esc toggling out of the entry box sounds fine to me. I bet you can have vim bindings by default, and they’ll work fine alongside other shortcuts. “i” to insert labels (both in labels and unified?) would be fine. Maybe the entry box could even respond directly to :wq?
Vim seems unusual for a text editor, where most people are used to something modeless, but with ImageMonkey's screen space divided between images, drawing tools, other controls and an entry box, it makes more sense.
Here’s a little mockup suggestion, e.g. “done” next in tab order beside the entry box. That might be quite easy to discover, although I admit “add” being on the left is a bit weird. (My reasoning is you could write many labels comma separated, then just do tab+space to hit done; if done automatically added the labels as well, you wouldn’t need something separate for add.) If you could click back on the greyed-out images in the browse view (i.e. greyed out tells you that you already clicked it, but doesn’t stop you clicking again), you could still correct a mistake.
I've added a few changes:
"alt+enter" is now mapped to the "done" button. (I've tried "alt+s" first, but this was already mapped to some other functionality on my system). Please let me know, if "alt+enter" works for you.
I've added a simple vim mode (it doesn't have many features yet, so it's purely experimental for now). Vim mode is off by default and can be enabled in the settings. The setting is, like all the other settings, stored in the browser's local storage.
By default the label input text field has focus. You can enter your labels as usual and add them via the "shift+enter" hotkey. After you are done entering labels, press `Esc`, which removes the focus from the text field, and enter `:wq` in the status bar at the bottom. If you've pressed `Esc` but want to switch back to the label input field, press `i` again.
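The mode switching described here can be sketched as a small state machine. This is only an illustration, not the actual implementation: the names (`VimState`, `handleKey`, `Effect`) are made up, and it assumes that a command typed into the status bar is submitted with Enter, which the description above does not state.

```typescript
type Mode = "insert" | "command";

interface VimState {
  mode: Mode;
  commandLine: string; // text typed into the status bar in command mode
}

// What the surrounding page should do after a key press.
type Effect = "none" | "saveAndClose";

// The label input has focus right after the page loads.
const initialState: VimState = { mode: "insert", commandLine: "" };

// `key` follows KeyboardEvent.key naming ("Escape", "Enter", "i", ":", ...).
function handleKey(state: VimState, key: string): { state: VimState; effect: Effect } {
  if (state.mode === "insert") {
    if (key === "Escape") {
      // Leave the label input; the status bar takes over.
      return { state: { mode: "command", commandLine: "" }, effect: "none" };
    }
    return { state, effect: "none" }; // ordinary typing goes to the label input
  }
  if (state.commandLine === "" && key === "i") {
    // Back to the label input.
    return { state: { mode: "insert", commandLine: "" }, effect: "none" };
  }
  if (key === "Enter") {
    // Assumption: Enter submits the typed command; only :wq is known here.
    const effect: Effect = state.commandLine === ":wq" ? "saveAndClose" : "none";
    return { state: { mode: "command", commandLine: "" }, effect };
  }
  if (key.length === 1) {
    // Printable character: append to the status bar text.
    return { state: { mode: "command", commandLine: state.commandLine + key }, effect: "none" };
  }
  return { state, effect: "none" };
}
```

Keeping the transitions in a pure function like this would make it straightforward to add further status bar commands later without touching the event wiring.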
Please let me know what you think. I hope these changes make it a bit more convenient to use for keyboard warriors.
That’s great. I see it focusing the input box by default, alt-enter works as “done”, and I see cursor keys navigating and scrolling through the browse view. You can indeed now just keep your hands on the keyboard.
I would suggest one little tweak to make it perfect: if alt+enter also added the current labels when the text box isn’t empty at the time you press it, you’d save a key press, a finger contortion (Shift+enter -> alt+enter) and a mental step. But what’s there now is easily good enough to use; label entry is now faster.
Further feedback: I started using "default-add labels" a bit more, and with the recent tweaks it's very fast. Previously it had a hazard that it could stick on one image type if someone did a bulk upload, but the order looks quite random, and it is tending to show unlabelled (which is great because they're the highest priority).
I'd reiterate the idea of making "done"/"alt+enter" automatically add if the entry box is not empty. It avoids the mistake of forgetting to press 'add', and saves one keypress (the faster you make it, the more each keypress is noticeable; you're approaching the limit where you just blast out the labels separated by commas, and press one key to commit and move on).
It's still good to have the option to search, because you can find the most unusual examples. (Eventually, if you can make unified find the unlabelled images, the need for separate search modes will be simplified out.)
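A minimal sketch of the suggested behaviour: when "done" is pressed, whatever is still in the entry box is parsed as comma-separated labels and added first. `parseLabels` and `labelsOnDone` are hypothetical helpers; skipping duplicates is an assumption.

```typescript
// Split "road, tree ,car," into ["road", "tree", "car"].
function parseLabels(text: string): string[] {
  return text
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Labels to submit when "done" is pressed: the ones already added, plus
// anything still sitting in the entry box (duplicates are skipped).
function labelsOnDone(alreadyAdded: string[], pendingText: string): string[] {
  const result = [...alreadyAdded];
  for (const label of parseLabels(pendingText)) {
    if (result.indexOf(label) === -1) result.push(label);
  }
  return result;
}
```

With an empty or whitespace-only entry box this returns the already-added labels unchanged, so "done" behaves as it does today.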
> Previously it had a hazard that it could stick on one image type if someone did a bulk upload, but the order looks quite random, and it is tending to show unlabelled (which is great because they're the highest priority).
I recently made some changes to improve the randomness, great to hear that it has improved :) (First, it is checked whether there are unlabeled images; if so, a random unlabeled image is picked. If there are no unlabeled images, a random image is picked.)
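The selection rule in the parentheses can be sketched as follows. This is an in-memory illustration only; the real service presumably does this on the server side, and `ImageEntry` and `pickNextImage` are hypothetical names.

```typescript
interface ImageEntry {
  id: string;
  labels: string[];
}

// Prefer a random unlabeled image; if every image has labels, pick any
// image at random. `random` is injectable so the choice can be tested.
function pickNextImage(
  images: ImageEntry[],
  random: () => number = Math.random
): ImageEntry | undefined {
  if (images.length === 0) return undefined;
  const unlabeled = images.filter((image) => image.labels.length === 0);
  const pool = unlabeled.length > 0 ? unlabeled : images;
  return pool[Math.floor(random() * pool.length)];
}
```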
> I would suggest one little tweak to make it perfect - if alt+enter also added the current labels if the text box isn’t empty at the time you press it, you’d save a key press, a finger contortion (Shift+enter -> alt+enter) and a mental step.
yeah, that makes sense. I'll add that to my todo list, thanks for the suggestion! :)
I'll let you know once it is implemented. :)
I've noticed you can use the "/" and brackets in unified view now - thanks for that fix as well!
Would it be possible to add a hotkey (advanced user feature) to add labels directly in the search view? E.g. imagine CTRL-click or right click or even a plain key press (with the cursor over an image) firing up a plain `prompt()` box showing the current labels, and asking for comma-separated labels to add. This would save the delay of serving the add labels page, and the UI click to hit the text entry box. At a further extreme, imagine if there was a highlighted image which you could move around with the cursor keys (and pressing enter would bring up the add label pop-up).
This would let a keyboard user flick through and add labels with hands 100% on the keyboard, it would be blindingly fast :)
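A small sketch of moving a highlighted image around the search grid with the cursor keys. It assumes the thumbnails form a regular grid with a fixed number of columns, which may not match the real layout; `moveHighlight` is a hypothetical helper.

```typescript
// Return the index of the newly highlighted thumbnail. `key` follows
// KeyboardEvent.key naming. Moves that would leave the grid are ignored.
function moveHighlight(index: number, key: string, columns: number, total: number): number {
  let next: number;
  switch (key) {
    case "ArrowLeft":
      next = index - 1;
      break;
    case "ArrowRight":
      next = index + 1; // at the end of a row this continues on the next row
      break;
    case "ArrowUp":
      next = index - columns;
      break;
    case "ArrowDown":
      next = index + columns;
      break;
    default:
      return index; // not a navigation key
  }
  return next >= 0 && next < total ? next : index;
}
```

Pressing enter on the highlighted image could then open the add-label prompt for the image at the returned index.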