eskerda / pybikes

bike sharing + python = pybikes
https://citybik.es
GNU Lesser General Public License v3.0

standardized "extra" fields #155

Open rolinger opened 8 years ago

rolinger commented 8 years ago

I am very new to pybikes (and citybik.es). As I look through how the data is compiled, I see there is an opportunity for better standardization of fields. I don't know how that impacts the current deployment, or how much changing it would break things (others' use of the data). But I do see the need.

The biggest area for better standardization is the EXTRAs object. There shouldn't be 3 different ways of labeling an address (e.g. "address", "stAddress1", ""), and some don't even have an address. It should simply be a standard "address1", "address2"... If no address1 or address2 exists, then their values should be set to null. Or take station.extra.slots versus station.extra.totalDocks: there are plenty of key names that can be standardized.

The same goes for other values that some vendors provide and others don't, e.g. station.extra.altitude or station.extra.landmark: if the value is present, great... if not, then it should be set to null.
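To make the proposal concrete, here is a minimal sketch of what such a normalization step could look like. The alias table and the `normalizeExtra` name are assumptions made up for illustration (they are not part of pybikes or the citybik.es API); the key names are only the ones mentioned in this thread.

```javascript
// Hypothetical alias table: canonical key -> vendor-specific keys mentioned in this thread.
const EXTRA_ALIASES = {
  address1: ["address1", "address", "stAddress1"],
  address2: ["address2", "stAddress2"],
  slots: ["slots", "totalDocks"],
  altitude: ["altitude"],
  landmark: ["landmark"],
};

// Return an object that always has every canonical key; missing values become null.
function normalizeExtra(extra) {
  const source = extra || {};
  const normalized = {};
  for (const [canonical, aliases] of Object.entries(EXTRA_ALIASES)) {
    // Take the first alias that carries a usable (non-empty) value.
    const key = aliases.find(
      (k) => source[k] !== undefined && source[k] !== null && source[k] !== ""
    );
    normalized[canonical] = key === undefined ? null : source[key];
  }
  return normalized;
}

// Example: a feed that uses "stAddress1" and "totalDocks".
const example = normalizeExtra({ stAddress1: "1 Main St", totalDocks: 12 });
console.log(example);
```

With something like this, consumers could rely on `address1` or `slots` always being present (possibly null) instead of probing vendor-specific names.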

This effort would begin the process of moving towards a complete standard. I know this is easier said than done. IMO, I see pybikes/citybik.es eventually becoming the standard that everyone else must cater to. Right now it seems the effort is centered around how each vendor structures their individual data. That is not scalable. (If this is already in the works, by all means, please point me to these discussions.)

bcaller commented 8 years ago

We definitely need this to standardise some of the extra fields. In my Pebble app I've had to add lots of cases just to decide whether or not to list a station as active. It doesn't even cover everything.

function isActiveStation(station) {
    // No slots and no bikes reported (missing or zero): treat as inactive.
    if(!station.empty_slots && !station.free_bikes) return false;
    // Also catches negative counts.
    if(station.empty_slots <= 0 && station.free_bikes <= 0) return false;
    // Without vendor-specific data, the counts are all there is to go on.
    if(!station.extra) return true;
    if("installed" in station.extra && !station.extra.installed) return false;
    if("status" in station.extra) {
        // String() avoids a TypeError when status is not a string.
        var status = String(station.extra.status).toUpperCase();
        if(status === 'CLOSED' || status === 'OFFLINE') return false;
    }
    if("locked" in station.extra && station.extra.locked) return false;
    if("testStation" in station.extra && station.extra.testStation) return false;
    if("statusValue" in station.extra && station.extra.statusValue === 'Not In Service') return false;
    if("online" in station.extra && station.extra.online === false) return false;

    return true;
}

and for address I use station.extra.address || station.extra.location || station.extra.stAddress2 || station.extra.description
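For illustration, that fallback chain could be wrapped in a small helper along these lines. The `stationAddress` name is hypothetical. Unlike the bare `||` chain, this sketch tolerates a missing `extra` object and returns null (rather than undefined) when no key has a usable value.

```javascript
// Keys tried in order, taken from the fallback chain above.
const ADDRESS_KEYS = ["address", "location", "stAddress2", "description"];

// Return the first non-empty string among the known address keys, or null.
function stationAddress(station) {
  const extra = (station && station.extra) || {};
  for (const key of ADDRESS_KEYS) {
    const value = extra[key];
    if (typeof value === "string" && value.trim() !== "") {
      return value.trim();
    }
  }
  return null;
}

console.log(stationAddress({ extra: { address: "", description: "Near the park" } }));
```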

rolinger commented 8 years ago

We also need a few more standard keys too (non-"extra" fields).

There should be: vendorURL: 'http://www.bikevendor.com', vendorIOS: 'http://appstore.apple.com/vendorID=bikeVendor', vendorAndroid: 'http://playstore.appl.com/vendorID=bikeVendor', deeplinkIOS: 'bikevendor://?', deeplinkAndroid: 'com.bikevendor.android?'
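As a rough sketch of how those keys might look with the same "null when missing" rule suggested for the extra fields: the `vendorLinks` helper below is hypothetical and not part of any existing API. It only guarantees that all five proposed keys are always present.

```javascript
// The five keys proposed above.
const VENDOR_LINK_KEYS = [
  "vendorURL",
  "vendorIOS",
  "vendorAndroid",
  "deeplinkIOS",
  "deeplinkAndroid",
];

// Build a record with every proposed key; anything not supplied becomes null.
function vendorLinks(meta) {
  const source = meta || {};
  const links = {};
  for (const key of VENDOR_LINK_KEYS) {
    const value = source[key];
    links[key] = typeof value === "string" && value !== "" ? value : null;
  }
  return links;
}

// Example: a network for which only the website is known.
console.log(vendorLinks({ vendorURL: "http://www.bikevendor.com" }));
```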

I literally just went through and retrieved 40+ vendor URLs... ALL have apps, but I don't know if any of them have deeplinks. Collecting this data means researching 40+ vendor URLs, 40+ app store appIDs, 40+ playstore appIDs, 40+ iOS deeplinks, and 40+ Android deeplinks (if either of those last two even exist).

This data should be coming from the API, provided by the individual vendors... not something we have to go track down, checking every few days to see if there are any new vendors and then manually retrieving those new vendors' information.

eskerda commented 8 years ago

Why exactly should the API provide links to iOS and Android apps?

"This data should be coming from the API, provided by the individual vendors... not something we have to go track down, checking every few days to see if there are any new vendors and then manually retrieving those new vendors' information."

Whatever metadata is wanted can be added to either the data files or the actual python implementation of the spider.

You have the repo here. Feel free to contribute anything you think the project "should be providing". I am really sorry you have to go track down and check every few days what new networks have been added. I guess it's a lot of work compared to just running the project. If you know better, feel free to create your own project, the fork button is just on top of this page.

rolinger commented 8 years ago

My apologies... not trying to offend here. I guess I don't fully understand how all the data is being assembled. My assumption was that a vendor would register with the API site and this project would pull from their own listings. Thus it would be easier to have the vendors add or standardize the fields to meet this API's requirements. From your comments though, I think that assumption is wrong.

I have assembled all the above data. I would be happy to contribute all of it... I guess the initial manual data grab sucks, but after that it should be easy to maintain or update with new vendors.

As for why I want links to iOS/Android apps: my app displays bike data for any city where a bike share exists, but if a user doesn't have the bike vendor's app, or doesn't know where to get it, then it becomes a more complicated process for the user. If we are able to readily provide them those links... or check to see if the app exists... then (as an example) my app can direct them to where they need to go or auto-launch that vendor's app (assuming the user already has it).
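A hedged sketch of that selection logic, assuming the link keys proposed earlier in this thread: `bestVendorLink` is a hypothetical helper, and `appInstalled` is a flag the calling app would have to determine itself, since checking for an installed app is platform-specific and out of scope here.

```javascript
// Map a platform name to the suffix used in the proposed key names.
const PLATFORM_SUFFIX = new Map([
  ["ios", "IOS"],
  ["android", "Android"],
]);

// Prefer the deeplink when the app is installed, then the store link,
// then the vendor website. Returns null when nothing usable is known.
function bestVendorLink(links, platform, appInstalled) {
  const suffix = PLATFORM_SUFFIX.get(platform);
  const candidates = [];
  if (suffix) {
    if (appInstalled) candidates.push(links["deeplink" + suffix]);
    candidates.push(links["vendor" + suffix]);
  }
  candidates.push(links.vendorURL);
  const found = candidates.find((c) => typeof c === "string" && c !== "");
  return found === undefined ? null : found;
}

// Example values copied from the proposal earlier in this thread.
const sampleLinks = {
  vendorURL: "http://www.bikevendor.com",
  vendorIOS: "http://appstore.apple.com/vendorID=bikeVendor",
  deeplinkIOS: "bikevendor://?",
};
console.log(bestVendorLink(sampleLinks, "ios", true));
```

If the deeplink or store link is unknown for a platform, the sketch simply falls back to the vendor website.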

For the record... this is a GREAT project and I can tell a lot of work has gone into it. I am glad to help in any way I can.

eskerda commented 8 years ago

Going to quote on https://github.com/eskerda/pybikes/issues/180

This ticket is about standardizing extra fields.