Shared-Reality-Lab / IMAGE-server

IMAGE project server components

Please add to this list of weird preprocessor results #213

Closed jeffbl closed 1 year ago

jeffbl commented 2 years ago

When #194 is resolved, there will be a more formal way to log issues with preprocessors giving weird results, e.g., consistently wrong tags like "embassy". For now, please use this work item to report weird preprocessor results, with one graphic per comment. Please make sure to include:

Cybernide commented 2 years ago

https://images.fineartamerica.com/images/artworkimages/mediumlarge/3/old-montreal-cobblestone-street-scene-john-rizzuto.jpg "This picture of a parking garage contains 3 people 1 traffic light and 1 potted plant." This is not a parking garage; it is a photo of a street in Old Montreal.

jeffbl commented 2 years ago

The first part of that is the scene recognizer, which is known to be bad and is being removed from the build immediately. So I think we should focus on just the part after the "This picture of a SOMETHING" section, which is typically object detection.
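To separate the scene guess from the object detection part when reviewing reports in this thread, a small parser can help. This is only a sketch: `parse_description` is a hypothetical helper, not part of IMAGE-server, and it assumes the sentence shape seen in the outputs quoted here ("This [indoor|outdoor] picture [of a SCENE] contains N label N label and N label.").

```python
import re

# Hypothetical helper, not part of IMAGE-server. It assumes the sentence
# shape quoted in this thread and will reject anything else.
_SENTENCE = re.compile(
    r"^This (?:(?:indoor|outdoor) )?picture"
    r"(?: of an? (?P<scene>.+?))? contains (?P<objects>.+?)\.?$"
)
# One "<count> <label>" item, ending before the next count, " and <count>", or the end.
_ITEM = re.compile(r"(\d+) ([a-z ]+?)(?= \d+ | and \d+ |$)")


def parse_description(text):
    """Return (scene_or_None, {label: count}) for one rendered description."""
    match = _SENTENCE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized description: {text!r}")
    objects = {}
    for count, label in _ITEM.findall(match.group("objects")):
        objects[label] = int(count)
    return match.group("scene"), objects


scene, objects = parse_description(
    "This picture of a parking garage contains 3 people "
    "1 traffic light and 1 potted plant."
)
print("scene guess (ignore):", scene)
print("objects:", objects)
```

Labels are kept exactly as rendered (for example "people", not "person"), so counts from different runs can only be compared when the wording matches.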

Cybernide commented 2 years ago

Got it. Will ignore the scene recognizer.

jeffbl commented 2 years ago

https://en.wikipedia.org/wiki/Cooking#/media/File:Agdz-rosino-05.jpg This picture of an attic contains 2 oranges 5 bowls 1 dining table and 1 person. EXPECT: No oranges, but finds more kitchen items, e.g., skewers, eggs, kettle, food cans...

after #209 20211213: This picture contains 5 bowls 3 cups 2 oranges 1 spoon 1 dining table and 1 person.

after testing on a bigger model(yolov5s6): 4 bowls, 3 cups, 2 oranges, a spoon, a dining table, and a person.

csail-semantic segmentation output (current model): wall, floor, ceiling, person, table

mseg: 'ceiling', 'food_other', 'table', 'counter_other', 'bottle', 'cup', 'bowl', 'person', 'paper', 'wall'

[Screenshot of mseg output, 2022-03-26]
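The two segmentation label lists reported for this image are easier to compare as sets. The sketch below uses the labels exactly as listed in this comment; `compare_labels` is a hypothetical helper written for illustration, not something in the preprocessors.

```python
# Labels copied from the csail and mseg outputs reported for the cooking image.
csail_labels = {"wall", "floor", "ceiling", "person", "table"}
mseg_labels = {
    "ceiling", "food_other", "table", "counter_other", "bottle",
    "cup", "bowl", "person", "paper", "wall",
}


def compare_labels(first, second):
    """Hypothetical helper: split two label sets into shared and unique parts."""
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


report = compare_labels(csail_labels, mseg_labels)
for key, labels in report.items():
    print(f"{key}: {', '.join(labels)}")
```

For this image, mseg adds the kitchen-related labels (bottle, cup, bowl, food_other, counter_other, paper), while only the csail output reports floor.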

jeffbl commented 2 years ago

https://creativecommons.org/2016/08/02/open_building_institute/kitchen/ This picture of an alcove contains 6 apples 2 chairs 1 orange 1 book and 1 bottle. Expect: No oranges (what is it with oranges, when there is nothing the color orange in the pictures?), no apples, yes eggs, there are like 100 books, not just one. Bottle is so-so. Does not find tables.

after #209 20211213: This indoor picture contains 2 dining tables 5 books 3 chairs and 1 bottle.

after testing on a bigger model(yolov5s6): 7 books, 2 dining tables, 3 chairs, a couch, and a bottle.

csail-semantic segmentation output (current model): wall, floor, table, book, shelf

mseg: 'ceiling', 'floor', 'apple', 'chair_other', 'armchair', 'couch', 'table', 'basket', 'bookshelf', 'door', 'shelf', 'book', 'bottle', 'cup', 'bowl', 'person', 'sports_ball', 'pillow', 'wall'

[Screenshot, 2022-03-26]

jeffbl commented 2 years ago

https://unsplash.com/photos/bT3dHRFAREA This picture of a plaza contains 8 people and 1 traffic light. EXPECT: Flags, people count is off (maybe we should say "about" or since it seems to undercount, "at least"?)
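One way the "about" / "at least" idea could look in the rendered text is sketched below. Everything here is an assumption for illustration: `hedged_count` is a hypothetical function, and the cutoff of 3 for reporting an exact number is arbitrary, not something decided in this thread.

```python
def hedged_count(count, singular, plural, exact_up_to=3):
    """Phrase a detection count, hedging larger counts with "at least".

    Hypothetical sketch: small counts are reported as-is, while larger
    counts are prefixed with "at least" because the detector appears to
    undercount crowded scenes. The exact_up_to cutoff is an assumption.
    """
    if count <= 0:
        return f"no {plural}"
    if count == 1:
        return f"1 {singular}"
    if count <= exact_up_to:
        return f"{count} {plural}"
    return f"at least {count} {plural}"


print(hedged_count(8, "person", "people"))
print(hedged_count(1, "traffic light", "traffic lights"))
```

Using "at least" rather than "about" matches the observation that the detector seems to undercount; if overcounting also turns out to be common, "about" would be the safer wording.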

after #209 20211213: This picture contains 3 handbags 11 people and 2 traffic lights.

after testing on a bigger model (yolov5s6): 11 people, 2 traffic lights, and a handbag.

csail-semantic segmentation output (current model): person, floor, building, sky, tree

mseg: 'building', 'road', 'person', 'sky', 'vegetation'

[Screenshot of mseg output, 2022-03-26]

jeffbl commented 2 years ago

https://creativecommons.org/2016/08/02/open_building_institute/modular-house/ This picture of a campsite contains 8 people and 1 surfboard. EXPECT: Construction site, board (not a surfboard). Ladder? Hardhats? Bricks?

after #209 20211213: This outdoor picture contains 7 people.

after testing on a bigger model(yolov5s6): 7 people

csail-semantic segmentation output (current model): wall, person, floor, sky, tree

mseg: building, floor, chair_other, armchair, table, stairs, sidewalk_pavement, terrain, person, bench, sky, baseball_bat, skateboard, vegetation, airplane, wall

[Screenshot of mseg output, 2022-03-26]

jeffbl commented 2 years ago

https://api.creativecommons.engineering/v1/thumbs/aad5765c-a8a0-4d16-8dfa-d9f8c3e69155 This picture of a restaurant kitchen contains 4 chairs 2 refrigerators 2 ovens 2 microwaves 1 vase 1 bottle 1 remote 1 clock and 1 dining table. EXPECT: Finds multiples of appliances (maybe toaster oven and dishwasher are confusing?) I only see (and hear in the rendering) three chairs, not 4. I don't see any refrigerator. Others very good: remote, table, clock.

after testing on a bigger model(yolov5s6): the link is no longer valid

jeffbl commented 2 years ago

https://search.creativecommons.org/photos/ca32bd72-5e2b-4949-b641-8a9a13a1b6ad This picture of a restaurant kitchen contains 2 ovens 1 sink and 1 refrigerator. EXPECT: one oven, one dishwasher (yes, dishwashers look like ovens...) Would be nice to hear about cabinets or handles.

after #209 20211213: This indoor picture contains 2 ovens 1 sink and 1 refrigerator.

csail-semantic segmentation output(current model): request not getting sent to server

rohanakut commented 2 years ago

https://www.istockphoto.com/photo/new-york-city-asphalt-road-on-busy-intersection-streets-with-car-traffic-at-daytime-gm1133502463-300855877

after #209: 4 people, 2 cars, 2 trucks, a bicycle, and a traffic light.

after testing on a bigger model(yolov5s6): 4 people, 2 cars, 2 trucks, a bicycle, and a traffic light.

csail-semantic segmentation output (current model): building, road

mseg: 'building', 'road', 'sidewalk_pavement', 'person', 'sky', 'car', 'truck'

[Screenshot of mseg output, 2022-03-26]

jeffbl commented 1 year ago

I am closing this as the last updates are from a year ago, and are likely no longer relevant.