murraycu / android-galaxyzoo

This Android app lets you classify Galaxy Zoo subjects. It is available in the Google Play Store: https://play.google.com/store/apps/details?id=com.murrayc.galaxyzoo.app . Try beta versions early here: https://play.google.com/apps/testing/com.murrayc.galaxyzoo.app . See also the iPhone app for Galaxy Zoo: https://github.com/murraycu/ios-galaxyzoo/
GNU General Public License v3.0

Identify classifications as coming from the app #11

Closed willettk closed 9 years ago

willettk commented 9 years ago

From a data analysis standpoint, we'd like to know which classifications users made through the app as opposed to the browser. Can you add a field to the final upload that indicates this? Something like:

{
interface: "android_app"
}

or equivalent would be fine.

@camallen suggested that it'd probably be done here:

https://github.com/murraycu/android-galaxyzoo/blob/f0db9ff02695a835369c20098bae6220e97ecfc4/app/src/main/java/com/murrayc/galaxyzoo/app/syncadapter/SyncAdapter.java#L322

murraycu commented 9 years ago

Gladly: https://github.com/murraycu/android-galaxyzoo/commit/8b7ac85eb84da614d98484f0db50e11707d32af3

I've used the existing User-Agent string instead of just "android-app" because it seems more specific in case there are ever more active apps. People might think they should reuse "android-app" but they wouldn't reuse someone else's domain.

So a classification now looks like this (in the POST's content):

interface:murrayc.com-android-galaxyzoo
classification[subject_ids][]:504e57f9c499611ea6019474
classification[annotations][0][sloan-0]:a-1
classification[annotations][1][sloan-1]:a-1
classification[annotations][2][sloan-2]:a-1
classification[annotations][3][sloan-3]:a-0
classification[annotations][4][sloan-9]:a-1
classification[annotations][5][sloan-10]:a-1
classification[annotations][6][sloan-4]:a-1
classification[annotations][7][sloan-5]:a-0
classification[annotations][8][sloan-6]:a-0
classification[annotations][8][sloan-6]:x-4
classification[annotations][9][sloan-11]:a-1

However, maybe you'd prefer it to be classification[interface] instead of interface?

For existing classifications, you can also use the User-Agent of the HTTP POST, if that gets through to your database: https://github.com/murraycu/android-galaxyzoo/blob/f0db9ff02695a835369c20098bae6220e97ecfc4/app/src/main/java/com/murrayc/galaxyzoo/app/provider/HttpUtils.java#L56
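
As a rough illustration of the two places the identifier can travel, here is a minimal Python sketch (not the app's Java code) that puts the same string in the User-Agent header and in a top-level `interface` form field. The endpoint URL is a placeholder invented for this example, and the request is only constructed, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

APP_ID = "murrayc.com-android-galaxyzoo"  # identifier string quoted in this thread

# Hypothetical endpoint, used only so that a Request object can be built.
URL = "https://example.invalid/classifications"

# Form fields in the same name/value shape as the POST content shown above.
form = [
    ("interface", APP_ID),
    ("classification[subject_ids][]", "504e57f9c499611ea6019474"),
    ("classification[annotations][0][sloan-0]", "a-1"),
]

# Build (but do not send) a POST carrying the identifier in both places.
request = Request(
    URL,
    data=urlencode(form).encode("ascii"),
    headers={"User-Agent": APP_ID},
    method="POST",
)
```

Whether either value reaches the database depends on what the server chooses to store, which this sketch cannot show.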

brian-c commented 9 years ago

Galaxy Zoo is using an old branch of the main Zooniverse library, so it has its own Classification model which doesn't include the user agent.

Usually it's saved as an annotation (not a great place, but it's stuck for now) like this: https://github.com/zooniverse/Zooniverse/blob/master/src/models/classification.coffee#L91-L93

Which I think ends up looking like this:

. . .
classification[annotations][9][sloan-11]:a-1
classification[annotations][10][user_agent]:murrayc.com-android-galaxyzoo

Nothing outside the top-level classification key is stored.
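
A minimal sketch of that layout in Python (the app itself is Java, so this is only an illustration): the user agent is appended as one more annotation after the last answer, so everything stays under the top-level `classification` key. `build_classification_params` is a hypothetical helper, and it assumes one answer per annotation index, which the checkbox answers in the earlier example (two entries at index 8) do not follow.

```python
from urllib.parse import urlencode

USER_AGENT = "murrayc.com-android-galaxyzoo"  # value quoted in this thread


def build_classification_params(subject_id, answers, user_agent=USER_AGENT):
    """Return (name, value) pairs for a classification POST body.

    `answers` is a list of (question_id, answer_id) pairs,
    for example ("sloan-0", "a-1").
    """
    params = [("classification[subject_ids][]", subject_id)]
    for index, (question_id, answer_id) in enumerate(answers):
        name = f"classification[annotations][{index}][{question_id}]"
        params.append((name, answer_id))
    # The user agent goes in as the next annotation after the last answer.
    name = f"classification[annotations][{len(answers)}][user_agent]"
    params.append((name, user_agent))
    return params


params = build_classification_params(
    "504e57f9c499611ea6019474",
    [("sloan-0", "a-1"), ("sloan-1", "a-1")],
)
body = urlencode(params)  # percent-encoded form body
```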

willettk commented 9 years ago

I agree that storing the user agent as an annotation is annoying (especially for RGZ; I hate them so much), but I suppose we'll deal with it as long as it's stored somewhere.

murraycu commented 9 years ago

So I should add this parameter instead of the "interface" thing?

classification[annotations][the-last-number][user_agent]:murrayc.com-android-galaxyzoo

willettk commented 9 years ago

Think so. Will that work, @brian-c ?

brian-c commented 9 years ago

That should be fine. @willettk are you working with the data directly from Mongo? I thought everybody got a nice CSV with labelled columns.

willettk commented 9 years ago

For GZ, yes. I work with the Mongo data from Radio Galaxy Zoo, where user agent data is much more of a pain.

murraycu commented 9 years ago

OK: https://github.com/murraycu/android-galaxyzoo/commit/d02544875ea9cf64129a3544c70bae8b67334219

Here are the content parameters from an example classification POST:

classification[subject_ids][]:504e4706c499611ea600d59d
classification[favorite][]:true
classification[annotations][0][sloan-0]:a-1
classification[annotations][1][sloan-1]:a-1
classification[annotations][2][sloan-2]:a-0
classification[annotations][3][sloan-3]:a-0
classification[annotations][4][sloan-9]:a-1
classification[annotations][5][sloan-10]:a-5
classification[annotations][6][sloan-4]:a-1
classification[annotations][7][sloan-5]:a-1
classification[annotations][8][sloan-11]:a-1
classification[annotations][9][interface]:murrayc.com-android-galaxyzoo

I've uploaded a couple of classifications already so you can check it on the server.

brian-c commented 9 years ago

Sorry, that should be "user_agent", not "interface".

murraycu commented 9 years ago

Thanks for checking. Done: https://github.com/murraycu/android-galaxyzoo/commit/7b3ee22559e7df3fa727f5610202cb95bfaadf32

And I've uploaded some more classifications to test that.

murraycu commented 9 years ago

Could you please confirm that this is working for you?

willettk commented 9 years ago

There are several classifications that now have the annotation marked as coming from Android. However, there are only 13 in the entire sample - I would have expected much more than that if you've deployed this to the full audience.

Also: the new classification document has additional data normally associated with the subject (example below). I don't know if it's a problem, but it does bulk up the data products somewhat unnecessarily - I thought normally that one would get that data by linking to the subject_id. Is there a particular reason that it's been added, @brian-c or @murraycu?

> db.galaxy_zoo_classifications.findOne({'annotations.interface':{$exists:true}})
{
    "_id" : ObjectId("5482b4c227b56239a200000c"),
    "annotations" : [
        {
            "sloan-0" : "a-1"
        },
        {
            "sloan-1" : "a-0"
        },
        {
            "sloan-8" : "a-1"
        },
        {
            "sloan-5" : "a-1"
        },
        {
            "sloan-11" : "a-1"
        },
        {
            "interface" : "murrayc.com-android-galaxyzoo"
        }
    ],
    "created_at" : ISODate("2014-12-06T07:48:18Z"),
    "favorite" : [
        "true"
    ],
    "project_id" : ObjectId("502a90cd516bcb060c000001"),
    "subject_ids" : [
        ObjectId("504e5d62c499611ea601b902")
    ],
    "subjects" : [
        {
            "id" : ObjectId("504e5d62c499611ea601b902"),
            "zooniverse_id" : "AGZ0002f42",
            "location" : {
                "standard" : "http://www.galaxyzoo.org.s3.amazonaws.com/subjects/standard/1237663229070082210.jpg",
                "thumbnail" : "http://www.galaxyzoo.org.s3.amazonaws.com/subjects/thumbnail/1237663229070082210.jpg",
                "inverted" : "http://www.galaxyzoo.org.s3.amazonaws.com/subjects/inverted/1237663229070082210.jpg"
            },
            "coords" : [
                279.399026983604,
                78.0015454472869
            ],
            "metadata" : {
                "counters" : {
                    "feature" : 24,
                    "smooth" : 13,
                    "star" : 13
                }
            }
        }
    ],
    "tutorial" : false,
    "updated_at" : ISODate("2014-12-06T07:48:02.702Z"),
    "user" : {
        "classification" : "feature"
    },
    "user_ip" : "88.217.180.214",
    "workflow_id" : ObjectId("50251c3b516bcb6ecb000002")
}

murraycu commented 9 years ago

> There are several classifications that now have the annotation marked as coming from Android.

Good, so it's basically working.

> However, there are only 13 in the entire sample

I bet most of them are me testing.

> I would have expected much more than that if you've deployed this to the full audience.

There are still not that many people using it, and I'd expect a lot of people to install it and forget about it after playing with it. Broadly, about 100 people have had the new version for about a week, during a time when there weren't many new installs. I'd be fascinated to know what the visitor retention numbers are for the website.

> Also: the new classification document has additional data normally associated with the subject (example below).

Is this specific to the classifications from the app? If so, then I think someone would have to look at what's happening on the server to trigger a change in behaviour.

willettk commented 9 years ago

My bad - virtually all classifications (except for the first couple) do have that subject data in the classification document, so that's nothing specific to the app.

camallen commented 9 years ago

@willettk - re numbers of classifications, check for the user_agent key as per https://github.com/murraycu/android-galaxyzoo/issues/11#issuecomment-65890163 and https://github.com/murraycu/android-galaxyzoo/commit/7b3ee22559e7df3fa727f5610202cb95bfaadf32
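
To see why the key matters, here is a small Python sketch over plain dictionaries shaped like the `annotations` array in the example document above. The sample documents are made up for illustration; the point is only that a query on `interface` would presumably match just the uploads made before the key was renamed to `user_agent`.

```python
def has_annotation_key(classification, key):
    """True if any annotation in the classification document contains `key`."""
    return any(key in annotation
               for annotation in classification.get("annotations", []))


def count_with_key(classifications, key):
    """Count the classification documents that carry `key` in an annotation."""
    return sum(1 for c in classifications if has_annotation_key(c, key))


# Made-up documents: one early test upload, one upload after the rename,
# and one classification with no client marker at all.
sample = [
    {"annotations": [{"sloan-0": "a-1"},
                     {"interface": "murrayc.com-android-galaxyzoo"}]},
    {"annotations": [{"sloan-0": "a-1"},
                     {"user_agent": "murrayc.com-android-galaxyzoo"}]},
    {"annotations": [{"sloan-0": "a-0"}]},
]

old_key_count = count_with_key(sample, "interface")
new_key_count = count_with_key(sample, "user_agent")
```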

willettk commented 9 years ago

Ah - thanks, @camallen! Looks like we have 5006 classifications so far from the app; much more like what I'd thought.

camallen commented 9 years ago

Good stuff! Out of interest, how many unique app users have classified?

willettk commented 9 years ago

46 so far. Embarrassing that I'm apparently not among them; guess I haven't logged back in since the latest update.

db.galaxy_zoo_classifications.distinct('user_name',{'annotations.user_agent':{$exists:true}})

willettk commented 9 years ago

About 27% (1354 classifications) on the app are from non-logged-in users so far. 23 different countries, too (most are from Germany, UK, or the US).

murraycu commented 9 years ago

> Looks like we have 5006 classifications

It's good to know people are using it. Thanks.

So, I think this is done. Closing.