zooniverse / planet-four

Identify and measure features on the surface of Mars
https://www.planetfour.org/
Apache License 2.0

0.18 % of all data carrying default values #105

Closed michaelaye closed 9 years ago

michaelaye commented 10 years ago

I have identified 5764 fan classifications (0.172 %) and 7365 blotch classifications (0.177 %) that carry the signature of default values at coordinates (0,0)! In my opinion these are hard to create on purpose, because of the inherent difference between the mouse cursor position and the exact position where a marking is placed at mouse-click (i.e. they are several pixels apart). I found the below linked stacks of classifications by doing a search for

Here is a plot over time, showing how many of these have been found with the same classification_id. This is also the meaning of the number in the files linked below: it is how many entries for this classification_id carry these default data values.

[Plot: Default data over time]

Sometimes the same user creates this kind of classification with a different classification_id for the same image_id. This may be related to issue #100, or it may be a separate problem on top of it.

Here are the files with the metadata for this kind of classification:

https://www.dropbox.com/s/zo046s9vsv7e918/default_data_fans.csv
https://www.dropbox.com/s/ukd1xwo1ujzz4hg/default_data_blotches.csv
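For anyone who wants to reproduce the per-classification_id counts, here is a minimal sketch of the idea. It is not the exact search used above. It assumes the default-value signature means that both the `x` and `y` columns are exactly 0 (the signature could equally apply to `image_x`/`image_y`). The function name `count_default_markings` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_default_markings(rows, x_key="x", y_key="y"):
    """Count, per classification_id, the markings located exactly at (0, 0).

    ASSUMPTION: a "default value" marking is one whose x and y are both 0.
    """
    counts = Counter()
    for row in rows:
        try:
            x, y = float(row[x_key]), float(row[y_key])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric coordinates
        if x == 0 and y == 0:
            counts[row["classification_id"]] += 1
    return counts


# Made-up sample rows; a real run would read the exported CSV with
# csv.DictReader(open(path, newline="")) instead.
sample = io.StringIO(
    "classification_id,marking,x,y\n"
    "c1,fan,0,0\n"
    "c1,fan,0,0\n"
    "c2,blotch,412.5,130.0\n"
    "c3,blotch,0,0\n"
)
default_counts = count_default_markings(csv.DictReader(sample))
```

The resulting counter maps each classification_id to the number of its entries sitting at (0,0), which is the number reported in the linked files.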

mschwamb commented 10 years ago

@chrissnyder this is something we'd like to talk about on the telecon. Is this something you can take a look at for the call next week? Thanks, ~Meg

mschwamb commented 10 years ago

@brian-c, is this something you could answer?

brian-c commented 10 years ago

If it's the same users, I'd guess it's a browser issue. Does your data come with the userAgent field from the classifications?

michaelaye commented 10 years ago

No, it doesn't. These are our fields:

"classification_id","created_at","image_id","image_name","image_url","user_name","marking","x_tile","y_tile","acquisition_date","local_mars_time","x","y","image_x","image_y","radius_1","radius_2","distance","angle","spread"

michaelaye commented 10 years ago

@parrish Maybe we could have the userAgent field in a database dump? Then I could check for myself whether this is browser related.

mschwamb commented 10 years ago

@parrish and @chrissnyder, if this is a simple query against the saved mongo db copy of the classification database, can you send us the mongo db? Or, if you wouldn't mind, could you run the query yourself and see whether a single browser is associated with the (0,0) markings? We'd appreciate it.
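In case it helps, the browser check could look roughly like the sketch below once a user agent column is available in a CSV export. This is only an illustration: the column name `user_agent`, the function name `tally_user_agents`, and the sample rows (including the browser names) are hypothetical, and it again assumes that a default marking has `x` and `y` both equal to 0.

```python
import csv
import io
from collections import Counter


def is_default(row):
    """ASSUMPTION: a default marking has x == 0 and y == 0."""
    try:
        return float(row["x"]) == 0 and float(row["y"]) == 0
    except (KeyError, TypeError, ValueError):
        return False


def tally_user_agents(rows, agent_key="user_agent"):
    """Return (flagged, total) counters keyed by user agent string."""
    flagged, total = Counter(), Counter()
    for row in rows:
        agent = row.get(agent_key) or "unknown"
        total[agent] += 1
        if is_default(row):
            flagged[agent] += 1
    return flagged, total


# Made-up sample; "user_agent" is a hypothetical column name.
sample = io.StringIO(
    "classification_id,x,y,user_agent\n"
    "c1,0,0,BrowserA\n"
    "c2,10,20,BrowserA\n"
    "c3,0,0,BrowserA\n"
    "c4,5,5,BrowserB\n"
    "c5,0,0,BrowserB\n"
)
flagged, total = tally_user_agents(csv.DictReader(sample))
for agent in sorted(total):
    print(f"{agent}: {flagged[agent]} of {total[agent]} markings at (0,0)")
```

Comparing the flagged count with the total per user agent matters here: a browser that is simply popular would dominate the (0,0) markings even if it were not the cause.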

mschwamb commented 10 years ago

Talked to Chris Lintott. I have a suggested plan for moving forward with dealing with these. He also suggested a few checks to verify that it's likely a JavaScript dropout. I will summarize on the next science telecon.

parrish commented 10 years ago

I've added the user agent field to the data export. You should've received an email, but I'll send around a link just in case.

mschwamb commented 9 years ago

As per the December telecon, I think we've sorted this out as a script dropout and are able to remove them. I've written up what I discussed on the telecon in the current paper draft. Can we close this ticket for bookkeeping?

Thanks.