PokemonGoers / PokeData

In this project you will scrape as much data as you can get about the *actual* sightings of Pokemons. As it turns out, players all around the world have started reporting sightings of Pokemons and are logging them into a central repository (i.e. a database). We want to get this data so we can train our machine learning models. You will of course need to come up with other data sources, not only for sightings but also for other relevant details that can be used later on as features for our machine learning algorithm (see Project B). Additional features could be the air temperature at the timestamp of the sighting, or whether the location is close to water, buildings or parks. Consult a Pokemon Go expert if you have one around you, and come up with as many features as possible that describe the place, time and name of a sighted Pokemon. Another feature that you will implement is a Twitter listener: you will use the Twitter streaming API (https://dev.twitter.com/streaming/public) to listen on a specific topic (for example, the #foundPokemon hashtag). When a new tweet with that hashtag is written, an event will be fired in your application that checks the details of the tweet, e.g. location, user, time stamp. Additionally, you will try to parse formatted text from the tweets to construct a new “seen” record that will then be added to the database. Some of the attributes of the record will be the Pokemon's name, location and the time stamp. Additional data sources (here is one: https://pkmngowiki.com/wiki/Pok%C3%A9mon) will also need to be integrated to give us more information about Pokemons, e.g. what they are, how they are related, what they can transform into, which attacks they can perform etc.
Apache License 2.0

JSON field name mismatch #110

Closed: Aurel-Roci closed this issue 8 years ago

Aurel-Roci commented 8 years ago

Hi guys, I noticed that the pokemonId field has two different names: pokemonId and pokemonID. Can you fix this?

Thanks

swathi-ssunder commented 8 years ago

@Aurel-Roci - Hey, I noticed this issue on the 31st and fixed it. So if you tested this on the 31st, then yes, this discrepancy existed. See https://github.com/PokemonGoers/PokeData/commit/e59655475c1538e6261b558b1bfc6520e4b35b3e

I also tested the API responses just now and could not find the discrepancy. It should be pokemonID everywhere. So could you please tell me where you are getting pokemonId?

Aurel-Roci commented 8 years ago

@swathi-ssunder In the last 600 entries it is pokemonId. Also, right before these 600 there are 7 entries that have no location coordinates.
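A consumer could confirm both observations with a quick audit of the API response. The following is only a sketch: the `audit` function and the two sample records are hypothetical, shaped like the sightings quoted in this thread, and the real response may contain other fields.

```python
from collections import Counter


def audit(records):
    """Count which spelling of the ID field each record uses,
    and how many records have no usable coordinates."""
    spellings = Counter()
    missing_location = 0
    for rec in records:
        for key in ("pokemonId", "pokemonID"):
            if key in rec:
                spellings[key] += 1
        # "location" may be absent, null, or lack coordinates
        location = rec.get("location")
        if not location or not location.get("coordinates"):
            missing_location += 1
    return spellings, missing_location


# Hypothetical sample data, not taken from the live API
sample = [
    {"pokemonID": 41, "source": "SKIPLAGGED",
     "location": {"type": "Point", "coordinates": [-84.242657, 9.914319]}},
    {"pokemonId": 1, "source": "TWITTER", "location": None},
]

spellings, missing = audit(sample)
print(dict(spellings), missing)
```

Running this over the full response would show how many entries use each spelling and how many lack coordinates.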

swathi-ssunder commented 8 years ago

@Aurel-Roci - Thanks, found them. It should be fixed now. The entries without location coordinates come from Twitter, for which location is not available. @samitsv

Aurel-Roci commented 8 years ago

@swathi-ssunder Hey, sorry to bother you again, but there is something else that is causing us some problems. The JSON property order changes, e.g.:

{
    "_id": "57c92ff6f0d1ffd702ba9641",
    "appearedOn": "2016-09-02T07:38:48.000Z",
    "pokemonID": 41,
    "source": "SKIPLAGGED",
    "__v": 0,
    "location": {
        "coordinates": [
            -84.242657,
            9.914319
        ],
        "type": "Point"
    }
},
{
    "_id": "57c930554e3bd9e102471781",
    "source": "TWITTER",
    "pokemonID": 1,
    "appearedOn": "2016-09-02T07:55:01.307Z",
    "__v": 0,
    "location": null
},

Can you fix this?

swathi-ssunder commented 8 years ago

@Aurel-Roci - There is no need to be sorry:)

Will try to get this fixed. However, since it is a JSON object and not an array (where indices do matter), the order shouldn't really matter from the point of view of accessing the data. Just curious to know how this affects you!

jonas-he commented 8 years ago

A JS Object is an "unordered set of key/value pairs". "You cannot and should not rely on the ordering of elements within a JSON object." http://stackoverflow.com/questions/3948206/json-order-mixed-up
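To illustrate the point, here is a small Python sketch (the two literals are cut-down versions of the records quoted above). Parsed objects compare equal regardless of key order, so code that looks fields up by name is unaffected. Only code that walks the keys positionally sees a difference.

```python
import json

a = json.loads('{"source": "TWITTER", "pokemonID": 1}')
b = json.loads('{"pokemonID": 1, "source": "TWITTER"}')

# Dict equality compares keys and values, not their order
same_content = (a == b)

# Iteration order follows the order in the text, so it differs here
key_order_a = list(a)
key_order_b = list(b)

print(same_content, key_order_a, key_order_b)
```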

Aurel-Roci commented 8 years ago

@swathi-ssunder We are converting this data into .arff, and if the order of the entries differs, the program we are using for ML (Weka) will not open the file.

@jonas-he I read that when I was trying to fix the order on my side, but I still think you should be able to keep the order of the fields fixed, since for the most part the order is the same.
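One way to make the conversion independent of the property order is to fix the column order on the consumer side and look every field up by name. The sketch below is not the converter used in this project: the `COLUMNS` list and the `to_row` helper are hypothetical, the ARFF header (`@RELATION` / `@ATTRIBUTE` lines) is omitted, and the coordinates are assumed to be in GeoJSON order (longitude first). ARFF marks a missing value with `?`, which covers the Twitter entries without a location.

```python
import json

# Fixed column order chosen by the consumer (an assumption, not the project's schema)
COLUMNS = ["pokemonID", "source", "appearedOn", "longitude", "latitude"]


def to_row(rec):
    """Build one ARFF data row, looking fields up by name, never by position."""
    location = rec.get("location") or {}
    coords = location.get("coordinates") or [None, None]
    values = {
        "pokemonID": rec.get("pokemonID"),
        "source": rec.get("source"),
        "appearedOn": rec.get("appearedOn"),
        "longitude": coords[0],
        "latitude": coords[1],
    }
    # "?" is the ARFF marker for a missing value
    return ",".join("?" if values[c] is None else str(values[c]) for c in COLUMNS)


# The two records quoted earlier in this thread
records = json.loads("""[
  {"_id": "57c92ff6f0d1ffd702ba9641", "appearedOn": "2016-09-02T07:38:48.000Z",
   "pokemonID": 41, "source": "SKIPLAGGED", "__v": 0,
   "location": {"coordinates": [-84.242657, 9.914319], "type": "Point"}},
  {"_id": "57c930554e3bd9e102471781", "source": "TWITTER", "pokemonID": 1,
   "appearedOn": "2016-09-02T07:55:01.307Z", "__v": 0, "location": null}
]""")

rows = [to_row(r) for r in records]
for row in rows:
    print(row)
```

Both rows come out with the same column layout even though the two source objects list their properties in a different order.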

sacdallago commented 8 years ago

Sorry for jumping in super late. Can you make it pokemonId with a small d? The capital D doesn't make sense in camel case :) And yes, there's no ordering of the attributes inside an object :) I'm not even sure all records have all attributes, that's why it's NoSQL

fabe85 commented 8 years ago

Just created a new branch called "pokemonIdRenaming" to make this fix.

swathi-ssunder commented 8 years ago

The code changes and the DB migrations have been done. @Aurel-Roci, @sacdallago - The field is now called pokemonId.

Refer to #113 and #114