Open dobkeratops opened 6 years ago
here's a quick attempt to 'tree-ify' the current labels plus the ones I remember using on your site; however this is only 200. I seem to remember a while back counting 1000+ in your existing image set. Nonetheless, a tree of 200 would be better than nothing, r.e. managing free labels..
rationale here - I list 'object part' and hence 'wheel' etc since you could see some of those independently. I'm also interested in the idea of open-labelling with hierarchy (I still remember the label-me hierarchy as being awesome, especially in light of capsules; with free labelling you could still use the words 'wheel', 'leg' etc and make some sense of things even if you haven't fully categorised it yet).
object
object part
wheel
wheel of car
wheel of bicycle
headlight
tail light
head
head of person
head of dog
head of cat
torso
part of head
eye
eye of dog
eye of person
eye of cat
mouth
mouth of dog
mouth of person
nose
tire
tire of bicycle
tire of car
paw
beak
leg
leg of person
leg of animal
leg of cat
leg of dog
arm
arm of person
hand
hand of person
tail
tail of aircraft
tail of dog
wing
wing of aircraft
wing of bird
wing of insect
door
door of car
door of building
window
window of car
window of building
roof
roof of car
roof of building
pallet
wooden pallet
animal
vertebrate
mammal
dog
cat
sheep
cow
rabbit
mouse
rat
pig
bird
reptile
snake
lizard
crocodile
tortoise
invertebrate
insect
arthropod
mollusc
worm
construction materials
pile of bricks
plant
vegetation
hedge
tree
bush
grass
vines
moss
flower
machine (not vehicle)
construction machine
digger
excavator
bulldozer
backhoe loader
loader
forklift
agricultural vehicle
tractor
combine harvester
road vehicle
wheeled vehicle
car
saloon car
minivan
van
truck
semi truck
semi trailer
pickup truck
2 wheeled vehicle
motorbike
bicycle
road bike
city bike
mountain bike
bmx bike
delivery bot
aircraft
aeroplane
passenger jet
helicopter
quadcopter
boat
ship
ferry
cruise ship
cargo ship
canoe
wheelbarrow
consumer electronics
wristwatch
smartphone
laptop
tv
monitor
tool
knife
table knife
kitchen knife
carving knife
fork
table fork
pitch fork
pliers
hammer
chisel
shears
pruner
spade
drill
static man made structure
building
residential building
apartment block
house
warehouse
factory
multistorey car park
tower
radio tower
restaurant
cafe
shop
supermarket
religious building
church
cathedral
mosque
synagogue
wall
brick wall
bus shelter
telephone box
utility pole
electricity pylon
lamp post = street light
statue
monument
sculpture
bridge
footbridge
overpass
person
man
woman
baby
child
boy
girl
container
shipping container
cylindrical container
barrel
waste container
bin
street bin = litter bin
waste paper basket
kitchen bin
wheelie bin
skip
teapot
drinkware
cup
coffee cup
tea cup
wineglass
plate
dinner plate
paper
bound paper pages
book
magazine
newspaper
notepad
sports and exercise equipment
tennis racket
ball
tennis ball
golf ball
football
basketball
golf club
exercise equipment
dumbbell
barbell
clothing
headwear
hat
asian conical hat
cowboy hat
beanie hat
helmet
bicycle helmet
scarf
shoes
sandals
personal items
spectacles
sign
road sign
traffic sign
direction sign
furniture
table
dinner table
desk
chest of drawers
wardrobe
bed
chair
bench
picnic bench
traffic cone
traffic island
bollard
wooden bollard
metal bollard
traffic lights
satellite dish
litter
ladder
ramp
steps
pipe
drainpipe
barrier
temporary barrier
plastic barrier
concrete barrier
metal barrier
fence
metal fence
wooden fence
wall
brick wall
stone wall
railing
food
sandwich
bread
pizza
hamburger
vegetable
tomato
broccoli
mushroom
leek
cabbage
peas
beans
fruit
apple
banana
orange
grapefruit
peach
grapes
berry
strawberry
blueberry
raspberry
ground surface
outdoor ground surface
sand
soil
water
lake
river
sea
puddle
canal
road
paved area
pavement
cobblestone
path
rock surface
carpet
wood flooring
A general graph has the hazard of cycles.. there needs to be verification to prevent that; but if you're strictly logical about the ordering of 'X is a type of Y' , there should be no problems
yeah, right.
here's a quick attempt to 'tree-ify' the current labels plus the ones I remember using on your site; however this is only 200. I seem to remember a while back counting 1000+ in your existing image set. Nonetheless, a tree of 200 would be better than nothing, r.e. managing free labels..
Awesome!
One thing I am wondering is: Would it be possible to model a labels graph without making it a hierarchical structure? With the trending labels concept I guess we might stumble across new labels that we completely forgot while modelling the graph. In case those labels add a new hierarchy level we would need to change the parent of a lot of nodes (which could be pretty expensive with a lot of labels in place).
The problem reminds me a bit of the open/closed principle in software design. Ideally, when adding a new label, we shouldn't need to touch existing ones (or at least keep the change to a minimum). Not sure if there is a way to accomplish that, but from the label maintenance perspective that would be a dream ;)
A hierarchical structure also requires us to come up with a pretty good label structure that we can build upon. I am wondering if there is a possibility to flatten out the label hierarchy in order to make it possible to add a new hierarchy level without affecting existing nodes. This would give us the flexibility to add labels one by one.
Another thing is PostgreSQL: I don't have much experience with PostgreSQL and hierarchical/recursive queries. I guess it could be pretty expensive to lay out the hierarchical representation in memory. I am wondering if there are better ways to model this.
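For what it's worth, a recursive query over a simple parent column can be fairly compact. The following is only a sketch: it uses SQLite through Python's `sqlite3` module as a self-contained stand-in (PostgreSQL also supports `WITH RECURSIVE`), and the `label` table, its columns and the sample rows are all hypothetical, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per label, pointing at its direct parent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE label (name TEXT PRIMARY KEY, parent TEXT)")
conn.executemany(
    "INSERT INTO label (name, parent) VALUES (?, ?)",
    [
        ("animal", None),
        ("vertebrate", "animal"),
        ("mammal", "vertebrate"),
        ("dog", "mammal"),
        ("cat", "mammal"),
        ("plant", None),
        ("tree", "plant"),
    ],
)


def descendants(conn, root):
    """Return root plus every label below it, using a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(name) AS (
            SELECT name FROM label WHERE name = ?
            UNION ALL
            SELECT label.name
            FROM label JOIN subtree ON label.parent = subtree.name
        )
        SELECT name FROM subtree ORDER BY name
        """,
        (root,),
    ).fetchall()
    return [name for (name,) in rows]


print(descendants(conn, "animal"))
```

Whether this is cheap enough with a few thousand labels would need measuring against the real database; the sketch only shows the shape of the query.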
without making it a hierarchical structure?
right, you hit the nail on the head r.e. the problem. A simple hierarchical tree structure would be a stopgap, but often 'something that works' is better than nothing. gradual progress. The full versatile solution would be a generalised graph of 'X isa Y' relations such that you could say "aircraft isa vehicle", "bird isa flying_object", "bird isa animal", "aircraft isa flying_object" etc. That would also help customising 'branching CNNs' even further (did you want to divide animals from plants, or airborne objects from ground objects?)
r.e. SQL, I think those 'edges' would fit in a database? even so, seeing that extra complexity to figure out, my gut instinct would be to hardcode a simple tree (go source code literals, serving the whole tree to the client? a few KB of data) as a stopgap .. then you've got a reasonable compromise r.e. the fidelity of labelling and people still being able to grab stuff.
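A rough sketch of what such a generalised 'X isa Y' graph could look like, where a label may have more than one parent. The edge list and function names here are made up for illustration; nothing like this exists in the project yet.

```python
from collections import defaultdict

# Hypothetical 'X isa Y' edges; a label may have several parents.
ISA_EDGES = [
    ("aircraft", "vehicle"),
    ("aircraft", "flying_object"),
    ("bird", "animal"),
    ("bird", "flying_object"),
    ("helicopter", "aircraft"),
    ("dog", "animal"),
]


def build_children(edges):
    """Index the edges as parent -> set of direct children."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    return children


def all_kinds_of(label, children):
    """Every label that is (transitively) a kind of `label`."""
    found = set()
    stack = [label]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


children = build_children(ISA_EDGES)
print(sorted(all_kinds_of("flying_object", children)))
print(sorted(all_kinds_of("animal", children)))
```

With this shape, the same data can be split either as animals vs. vehicles or as airborne vs. everything else, just by picking a different starting label.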
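As an illustration of the stopgap (shown in Python rather than go source literals, purely to keep the examples in this thread in one language), a hardcoded tree could be a nested literal that gets serialised whole and sent to the client. The labels and helper names below are placeholders.

```python
import json

# Hypothetical hardcoded tree: each key is a label, each value its children.
LABEL_TREE = {
    "object": {
        "animal": {"dog": {}, "cat": {}},
        "road vehicle": {"car": {"saloon car": {}}, "truck": {}},
    }
}


def tree_to_json(tree):
    """Serialise the whole tree so it can be sent to a client in one response."""
    return json.dumps(tree, sort_keys=True)


def count_labels(tree):
    """Count every label in the nested tree."""
    return sum(1 + count_labels(sub) for sub in tree.values())


payload = tree_to_json(LABEL_TREE)
print(len(payload), "bytes for", count_labels(LABEL_TREE), "labels")
```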
Of course any user such as myself can manually inspect all the labels and preprocess the data already (its not a big job to pick out the labels you want from a flat list unless we get over 1000.. ) .. and we could share such lists on GitHub. So, perhaps I'm worrying about nothing.
I guess I hoped a tree structure as a starting point could shepherd people toward existing labels rather than inventing many ad hoc synonyms. Some of the objects out there can be described many ways.. in trying to label a specific 'barrier' type, I found 'plastic barrier', but also a specific type 'avalon barrier' which was exactly what I was looking at. And so on..
ah incidentally, one important point r.e. the hierarchy and parts:
perhaps if the image annotations really were stored with nesting, you could avoid needing separate labels 'eye/dog', 'eye/person' etc, and just store the label 'eye', and figure out it's '/dog' from the structure. I can see that helping with free labelling because some objects might be given obscure specific names, but they've still got common parts (many animals have 'eye', many mechanical objects have 'wheels', etc); and that would open yet another search ('show me all the things with wings..', 'show me all the things with legs or wheels')
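A small sketch of that idea, assuming annotations are stored as nested records with a label and a list of parts. The record layout and both function names are invented for the example; the search only looks at direct parts.

```python
# Hypothetical nested annotations: each has a label and a list of parts.
annotations = [
    {"label": "dog", "parts": [{"label": "eye", "parts": []},
                               {"label": "leg", "parts": []}]},
    {"label": "aircraft", "parts": [{"label": "wing", "parts": []},
                                    {"label": "wheel", "parts": []}]},
    {"label": "bird", "parts": [{"label": "wing", "parts": []},
                                {"label": "eye", "parts": []}]},
]


def qualified_labels(annotation, owner=None):
    """Yield 'part/owner' style names derived purely from the nesting."""
    label = annotation["label"]
    yield label if owner is None else f"{label}/{owner}"
    for part in annotation["parts"]:
        yield from qualified_labels(part, owner=label)


def things_with(part_labels, annotations):
    """Top-level labels that have at least one of the given direct parts."""
    wanted = set(part_labels)
    return [a["label"] for a in annotations
            if any(p["label"] in wanted for p in a["parts"])]


print(list(qualified_labels(annotations[0])))
print(things_with(["wing"], annotations))
print(things_with(["leg", "wheel"], annotations))
```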
Of course any user such as myself can manually inspect all the labels and preprocess the data already (its not a big job to pick out the labels you want from a flat list unless we get over 1000.. ) .. and we could share such lists on GitHub. So, perhaps I'm worrying about nothing.
you raised an interesting point here. What about a mixture of those two attempts?
In the database we store the label with its direct parent, which (hopefully) is already enough to uniquely identify a label. E.g. let's assume we have the label 'eye'. In order to distinguish between a person's eye and a dog's eye, we store its parent (i.e. person/dog). Everything else will be handled by an (in-memory) tree microservice. You can ask the microservice about any label and it will give you a list of all its children, which can then be used to query the database. E.g. let's assume we want to query the dataset for the label 'animal'. It would first traverse the in-memory tree and return a list of all labels ('dog', 'cat', 'monkey', ..). Those labels will then be used to query the database. That way we could keep the hierarchy level in the database to a minimum (maybe one parent is enough) and offload everything else to a microservice/binary. I guess it's not that likely that the direct parent will change... and as the remaining labels are handled by a separate in-memory tree, we won't need to touch the label structure in the database when we introduce a new label (e.g. we could introduce the label 'mammal' or 'placental mammals' without touching the database).
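To make the flow concrete, here is a minimal sketch of the in-memory side: expand a label into itself plus all descendants, then turn that list into a parameterised query. The `PARENT_OF` mapping, the `annotation` table and its columns are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical in-memory tree, given as label -> direct parent.
PARENT_OF = {
    "mammal": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "monkey": "mammal",
    "car": "vehicle",
}


def build_children(parent_of):
    """Invert the mapping into parent -> list of direct children."""
    children = defaultdict(list)
    for label, parent in parent_of.items():
        children[parent].append(label)
    return children


def expand(label, children):
    """Return the label followed by all of its descendants."""
    result = [label]
    for child in children.get(label, []):
        result.extend(expand(child, children))
    return result


def build_query(labels):
    """Build a parameterised query against a flat, hypothetical table."""
    placeholders = ", ".join("?" for _ in labels)
    sql = f"SELECT image_id FROM annotation WHERE label IN ({placeholders})"
    return sql, labels


children = build_children(PARENT_OF)
sql, params = build_query(expand("animal", children))
print(sql)
print(params)
```

In this sketch, introducing an intermediate label such as 'mammal' only changes `PARENT_OF`; the rows in the flat table stay as they are.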
I'll have to think some more about that.. let's see what the label list actually looks like over the next few days.
hope: if the label list explodes, we post something on github categorizing 1000 (my past go experiment did about that). perhaps the frequency of completely new labels would be low enough that you don't miss much data.
A few more advantages of the separate in-memory tree approach:
hopefully we don't have to change the label hierarchy in the database that often. Mirroring the json label structure to the database can be quite challenging, especially with a lot of nested labels. Another thing is the migration: if we change a label's position in the label.json file we also need those changes reflected in the database structure, which could be quite challenging to do generically. If we have a bug in our migration logic, we could easily render a lot of data useless (as images might be tagged with the wrong label)
testability: an in-memory tree graph is probably easier to (unit-)test than code that depends on a database backend.
we don't need to write migration code. if the label structure changes, we just completely re-build the in-memory tree.
we can keep the label structure in the database pretty flat.
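A sketch of the "just rebuild it" idea from the points above: parse a nested label file and flatten it into a parent map on every change, with no migration step. The JSON layout shown is hypothetical (the real labels.json may be structured differently), as are the function names.

```python
import json

# Hypothetical nested label file; the real labels.json layout may differ.
LABELS_JSON = """
{
  "animal": {"mammal": {"dog": {}, "cat": {}}},
  "vehicle": {"car": {}}
}
"""


def build_parent_map(nested, parent=None, out=None):
    """Flatten nested labels into {label: direct parent or None}."""
    if out is None:
        out = {}
    for label, sub in nested.items():
        out[label] = parent
        build_parent_map(sub, parent=label, out=out)
    return out


def ancestors(label, parent_map):
    """Return the chain of parents from nearest to root."""
    chain = []
    current = parent_map[label]
    while current is not None:
        chain.append(current)
        current = parent_map[current]
    return chain


parent_map = build_parent_map(json.loads(LABELS_JSON))
print(ancestors("dog", parent_map))
```

Because everything here is a pure function over in-memory data, it can be unit-tested without a database.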
(is there something to query the current labels - including the freely applied ones; I notice they do show up in autocomplete; Having played a little with the python upload script, I can imagine this sort of thing will be easy to experiment with via the APIs)
At the moment there is no unified API endpoint that returns both the freely applied labels and the static ones. There is this endpoint which returns a list of all the base labels from the labels.json. And this endpoint which also returns the quiz related labels. And of course the trending labels repository. But I guess it makes sense to combine all those labels in a single API call.
EDIT (ok i remember metalabels and I see you have a JSON structure in place already , let me read that again. I see it directs the 'quiz mode' but it's a fixed structure: metalabel->label->quiz refinement answers)
So now that you have free labelling, I think that raises the issue of a label graph again, so that people can still get a useful subset (e.g. if someone goes labelling 'SUV, saloon, hatchback' etc... you still want to be able to ask for "car" including all those)
I think that's essential for dynamic growth, i.e. at any moment in time you probably have to peel back the depth of classifications possible to know you have enough examples to train, whilst having the database in a state that is continuously increasing its depth. That could also be a differentiator from the existing curated challenges.
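One possible way to "peel back the depth" is sketched below: start at a root and only split a label into its children when every child has enough examples in its subtree; otherwise train on the parent as a single class. The tree, the counts and the threshold are all made-up numbers, and this is just one of several reasonable policies.

```python
# Hypothetical tree (label -> direct children) and raw example counts.
CHILDREN = {
    "vehicle": ["car", "truck"],
    "car": ["saloon", "hatchback", "suv"],
}
COUNTS = {"saloon": 40, "hatchback": 15, "suv": 250, "car": 30, "truck": 120}


def subtree_total(label, counts, children):
    """Examples labelled with `label` or anything below it."""
    return counts.get(label, 0) + sum(
        subtree_total(c, counts, children) for c in children.get(label, []))


def pick_classes(label, counts, children, min_examples):
    """Return {class: example count}, splitting a label into its children
    only when every child has at least min_examples in its subtree.
    Note: examples attached directly to a label that gets split are not
    assigned to any class in this simple sketch."""
    kids = children.get(label, [])
    totals = {k: subtree_total(k, counts, children) for k in kids}
    if kids and all(t >= min_examples for t in totals.values()):
        classes = {}
        for k in kids:
            classes.update(pick_classes(k, counts, children, min_examples))
        return classes
    return {label: subtree_total(label, counts, children)}


print(pick_classes("vehicle", COUNTS, CHILDREN, min_examples=100))
print(pick_classes("vehicle", COUNTS, CHILDREN, min_examples=10))
```

As more images get labelled, the same call yields progressively deeper classes without anyone editing the tree.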
It would be a whole new set of UI to manage that in general .. as a shortcut without that, maybe we could periodically update a curated label graph. I did that simple experiment in go (hard coding one), I could try to amend it based on what I've labelled here so far and see what it looks like.
A general UI would be awesome to have though..
A general graph has the hazard of cycles.. there needs to be verification to prevent that; but if you're strictly logical about the ordering of 'X is a type of Y' , there should be no problems
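The verification could be a plain depth-first search over the 'X is a type of Y' edges, run whenever the graph is edited. This is a minimal sketch with a hypothetical edge format of (child, parent) pairs; it reports one offending cycle if there is any.

```python
def find_cycle(edges):
    """Return one cycle as a list of labels, or None if the graph is acyclic."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, []).append(parent)
        parents.setdefault(parent, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {label: WHITE for label in parents}
    path = []

    def visit(label):
        state[label] = GREY
        path.append(label)
        for parent in parents[label]:
            if state[parent] == GREY:
                # Reached a label that is already on the current path.
                return path[path.index(parent):] + [parent]
            if state[parent] == WHITE:
                cycle = visit(parent)
                if cycle:
                    return cycle
        path.pop()
        state[label] = BLACK
        return None

    for label in parents:
        if state[label] == WHITE:
            cycle = visit(label)
            if cycle:
                return cycle
    return None


good = [("dog", "mammal"), ("mammal", "animal"), ("bird", "animal")]
bad = good + [("animal", "dog")]
print(find_cycle(good))
print(find_cycle(bad))
```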
A simpler label tree might also do the trick, it's just a bit harder to figure out a decent set of roots, less adaptive. But it might be an idea to start just by defining one of those (i think i saw a tree in COCOs we could copy?), looking at the labels we actually used so far.
EDIT: compared to the existing quiz structure, I imagine the next step being a label-tree: any label could be 'quizzed' to refine to its current children .. vehicle -> car -> convertible -> porsche boxster, plant -> tree -> oak tree, etc.
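A tiny sketch of how such a label-tree quiz might work: the options offered for any label are simply its current children, and a sequence of answers walks down the tree. The tree contents and function names are hypothetical.

```python
# Hypothetical label tree: label -> list of direct children.
CHILDREN = {
    "vehicle": ["car", "truck"],
    "car": ["convertible", "saloon"],
    "convertible": ["porsche boxster"],
    "plant": ["tree", "flower"],
    "tree": ["oak tree"],
}


def quiz_options(label):
    """The refinement choices for a label are its current children."""
    return CHILDREN.get(label, [])


def refine(start, answers):
    """Follow a sequence of quiz answers down from `start`."""
    current = start
    for answer in answers:
        if answer not in quiz_options(current):
            raise ValueError(f"{answer!r} is not a child of {current!r}")
        current = answer
    return current


print(quiz_options("car"))
print(refine("vehicle", ["car", "convertible", "porsche boxster"]))
```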