gwu-libraries / social-feed-manager

"Old SFM" -- manage rules and streams from social data sources, starting with twitter.
MIT License

Ambiguity of meaning in retweet_count data #99

Closed patrickmj closed 10 years ago

patrickmj commented 10 years ago

Just noticed this at random. In the data that is pulled in and displayed for my tweets, one of them lists an RT count of 142. But it looks like those aren't RTs of my tweet; it's the full count of RTs of the original tweet. And I think that my tweet (my RT) isn't counted as one of my tweets.

I follow what I think my status id should be: https://twitter.com/patrick_mj/status/408697937083269100, and get 404.

The data in the retweeted_status object seems to take precedence, and that is where the RT count seems to be coming from. That means the count might not be the number of people RTing me; it's the total count of RTs on the original tweet. Depending on the network analysis taking place, that distinction (if I'm following it right) will make a difference to the interpretation of the data.

Here's the JSON:


{

    "contributors": null,
    "truncated": false,
    "text": "RT @cstross: WTF? A Markov chain trained on the King James Bible and Structure and Interpretation of Computer Programs: http://t.co/VgT7MHN…",
    "in_reply_to_status_id": null,
    "id": 408697937083269100,
    "favorite_count": 0,
    "source": "<a href=\"http://twicca.r246.jp/\" rel=\"nofollow\">twicca</a>",
    "retweeted": true,
    "coordinates": null,
    "entities": {
        "symbols": [ ],
        "user_mentions": [
            {
                "id": 390039185,
                "indices": [
                    3,
                    11
                ],
                "id_str": "390039185",
                "screen_name": "cstross",
                "name": "Charlie Stross"
            }
        ],
        "hashtags": [ ],
        "urls": [ ]
    },
    "in_reply_to_screen_name": null,
    "id_str": "408697937083269120",
    "retweet_count": 142,
    "in_reply_to_user_id": null,
    "favorited": false,
    "retweeted_status": {
        "contributors": null,
        "truncated": false,
        "text": "WTF? A Markov chain trained on the King James Bible and Structure and Interpretation of Computer Programs: http://t.co/VgT7MHNw5Q",
        "in_reply_to_status_id": null,
        "id": 408681349948928000,
        "favorite_count": 66,
        "source": "<a href=\"http://tapbots.com/software/tweetbot/mac\" rel=\"nofollow\">Tweetbot for Mac</a>",
        "retweeted": true,
        "coordinates": null,
        "entities": {
            "symbols": [ ],
            "user_mentions": [ ],
            "hashtags": [ ],
            "urls": [
                {
                    "url": "http://t.co/VgT7MHNw5Q",
                    "indices": [
                        107,
                        129
                    ],
                    "expanded_url": "http://kingjamesprogramming.tumblr.com/",
                    "display_url": "kingjamesprogramming.tumblr.com"
                }
            ]
        },
        "in_reply_to_screen_name": null,
        "id_str": "408681349948928000",
        "retweet_count": 142,
        "in_reply_to_user_id": null,
        "favorited": false,
        "user": {
            "follow_request_sent": false,
            "profile_use_background_image": true,
            "default_profile_image": false,
            "id": 390039185,
            "verified": false,
            "profile_text_color": "333333",
            "profile_image_url_https": "https://pbs.twimg.com/profile_images/378800000712055382/6d1ca01eeb0031136dd6d7a920d0b1d0_normal.jpeg",
            "profile_sidebar_fill_color": "DDEEF6",
            "entities": {
                "url": {
                    "urls": [
                        {
                            "url": "http://t.co/XKTE5uejvd",
                            "indices": [
                                0,
                                22
                            ],
                            "expanded_url": "http://www.antipope.org/charlie/blog-static/",
                            "display_url": "antipope.org/charlie/blog-s…"
                        }
                    ]
                },
                "description": {
                    "urls": [ ]
                }
            },
            "followers_count": 18847,
            "profile_sidebar_border_color": "C0DEED",
            "id_str": "390039185",
            "profile_background_color": "C0DEED",
            "listed_count": 892,
            "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
            "utc_offset": 0,
            "statuses_count": 14383,
            "description": "I tell lies for money.\n...\n\nAlso: normality is overrated.",
            "friends_count": 251,
            "location": "",
            "profile_link_color": "0084B4",
            "profile_image_url": "http://pbs.twimg.com/profile_images/378800000712055382/6d1ca01eeb0031136dd6d7a920d0b1d0_normal.jpeg",
            "following": false,
            "geo_enabled": true,
            "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
            "screen_name": "cstross",
            "lang": "en",
            "profile_background_tile": false,
            "favourites_count": 3,
            "name": "Charlie Stross",
            "notifications": false,
            "url": "http://t.co/XKTE5uejvd",
            "created_at": "Thu Oct 13 11:15:11 +0000 2011",
            "contributors_enabled": false,
            "time_zone": "Edinburgh",
            "protected": false,
            "default_profile": true,
            "is_translator": false
        },
        "geo": null,
        "in_reply_to_user_id_str": null,
        "possibly_sensitive": false,
        "lang": "en",
        "created_at": "Thu Dec 05 19:36:41 +0000 2013",
        "in_reply_to_status_id_str": null,
        "place": null
    },
    "user": {
        "follow_request_sent": false,
        "profile_use_background_image": true,
        "default_profile_image": false,
        "id": 6114332,
        "verified": false,
        "profile_text_color": "000000",
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/1126458706/screwed-squared_normal.jpg",
        "profile_sidebar_fill_color": "E0FF92",
        "entities": {
            "url": {
                "urls": [
                    {
                        "url": "http://t.co/yAU920efcY",
                        "indices": [
                            0,
                            22
                        ],
                        "expanded_url": "http://hackingthehumanities.org",
                        "display_url": "hackingthehumanities.org"
                    }
                ]
            },
            "description": {
                "urls": [ ]
            }
        },
        "followers_count": 1531,
        "profile_sidebar_border_color": "87BC44",
        "id_str": "6114332",
        "profile_background_color": "9AE4E8",
        "listed_count": 157,
        "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
        "utc_offset": -18000,
        "statuses_count": 21071,
        "description": "Omeka dev team manager. Hacker, humanist at Roy Rosenzweig Center for History and New Media. Plays well with RDF, Drupal, Omeka, WordPress, Anthologize",
        "friends_count": 785,
        "location": "",
        "profile_link_color": "0000FF",
        "profile_image_url": "http://pbs.twimg.com/profile_images/1126458706/screwed-squared_normal.jpg",
        "following": false,
        "geo_enabled": false,
        "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
        "screen_name": "patrick_mj",
        "lang": "en",
        "profile_background_tile": false,
        "favourites_count": 348,
        "name": "Patrick Murray-John",
        "notifications": false,
        "url": "http://t.co/yAU920efcY",
        "created_at": "Thu May 17 17:13:41 +0000 2007",
        "contributors_enabled": false,
        "time_zone": "Eastern Time (US & Canada)",
        "protected": false,
        "default_profile": false,
        "is_translator": false
    },
    "geo": null,
    "in_reply_to_user_id_str": null,
    "lang": "en",
    "created_at": "Thu Dec 05 20:42:36 +0000 2013",
    "in_reply_to_status_id_str": null,
    "place": null

}

lwrubel commented 10 years ago

The retweet count is indeed a count of retweets of the original tweet. Even if someone retweets your retweet, it gets credited to the retweet count of the original tweet. Twitter's API docs say (https://dev.twitter.com/docs/faq#22233): "While it is possible for a user to retweet or favorite a retweet (that is, the outer-most object instead of the retweeted_status), the end result object created will be another retweet of the original status."

Note that this only applies when someone uses the Twitter-supported retweet feature. If they do an old-style retweet, just typing RT, the result is an ordinary tweet with no retweeted_status, so its retweet_count reflects retweets of the retweeter's own tweet, not of the original.
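To make the distinction concrete, here is a minimal sketch of how a consumer of this data might work out which tweet a retweet_count belongs to. The function name `retweet_count_owner` and the `"RT @"` prefix check for old-style retweets are assumptions made for illustration; neither is part of SFM or the Twitter API.

```python
def retweet_count_owner(tweet):
    """Report which tweet a status's retweet_count actually describes."""
    original = tweet.get("retweeted_status")
    if original is not None:
        # Native retweet: the count is the original tweet's total.
        return {
            "kind": "native_retweet",
            "counted_tweet_id": original["id_str"],
            "retweet_count": original["retweet_count"],
        }
    # Assumption: a leading "RT @" marks an old-style (typed) retweet.
    text = tweet.get("text", "")
    kind = "manual_retweet" if text.startswith("RT @") else "original"
    return {
        "kind": kind,
        "counted_tweet_id": tweet["id_str"],
        "retweet_count": tweet["retweet_count"],
    }


# Trimmed-down version of the status quoted in this issue.
native = {
    "id_str": "408697937083269120",
    "text": "RT @cstross: WTF? A Markov chain trained on the King James Bible",
    "retweet_count": 142,
    "retweeted_status": {
        "id_str": "408681349948928000",
        "retweet_count": 142,
    },
}

# Hypothetical old-style retweet, invented for this example.
manual = {"id_str": "1", "text": "RT @cstross: WTF?", "retweet_count": 0}

native_info = retweet_count_owner(native)
manual_info = retweet_count_owner(manual)
```

For the status in this issue, `native_info["counted_tweet_id"]` is the original tweet's id, not the retweet's, which is the ambiguity being reported.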

I think you're right that we should improve the documentation about this. I'll add info about this to our data dictionary which will be going up on the wiki soon.

@patrickmj I'm not sure what you meant about your retweet not being counted as one of your tweets. Could you explain more, if the info above doesn't cover it?

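The link to your status didn't work because it's using id instead of id_str. Tweet ids are 64-bit integers, and JSON parsers that store numbers as double-precision floats (as JavaScript does) can't represent integers that large exactly, so that id has been rounded and is not right. (More info: https://dev.twitter.com/docs/twitter-ids-json-and-snowflake) The link to your tweet would be: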

https://twitter.com/patrick_mj/status/408697937083269120
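As a rough illustration of why id_str is the safer field, the sketch below imitates a double-based JSON parser in Python and builds the status URL from id_str. The helper name `status_url` is an assumption for this example, and `parse_int=float` is used only to simulate what a JavaScript-style parser does; Python's own `json` module keeps integers exact by default.

```python
import json

raw = (
    '{"id": 408697937083269120, "id_str": "408697937083269120", '
    '"user": {"screen_name": "patrick_mj"}}'
)

exact = json.loads(raw)                       # Python keeps JSON integers exact
as_double = json.loads(raw, parse_int=float)  # imitate a double-based parser

# Doubles represent every integer exactly only up to 2**53;
# this id is far beyond that range.
above_safe_range = exact["id"] > 2**53

# The rounded id from the report and the true id map to the same double,
# so a double-based parser cannot tell them apart.
same_double = float(408697937083269100) == float(408697937083269120)


def status_url(tweet):
    """Build a status URL from id_str, which survives any JSON parser."""
    return "https://twitter.com/{}/status/{}".format(
        tweet["user"]["screen_name"], tweet["id_str"]
    )


url = status_url(as_double)
```

Because id_str is a string, `url` comes out the same whether the numeric id was parsed exactly or as a double.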

Noted as issue #128 to make sure we're using the right one everywhere in SFM.