jsoma / data-studio-projects


[Project] NYC bikeshare usage #185

Open maxarvid opened 6 years ago

maxarvid commented 6 years ago

Pitch

CitibikeNYC has exploded in popularity. What stories do its data tell us? Is there a difference in how the system is used by New Yorkers and tourists?

Summary

[Screenshot, 2018-07-17 08:44:03]

Here the gender code 1.0 signifies men, 2.0 women, and 0.0 unknown (I'll be dropping that one at a later stage). I could do a quick bar chart, using a color palette similar to the inspiration below:

[Image: chart used as color palette inspiration]

Details

Possible headline(s): 2017 Bikeshare usage

Data set(s): https://s3.amazonaws.com/tripdata/index.html

Code repository: https://github.com/maxarvid/data-studio/tree/master/code/02_bike_share

Possible problems/fears/questions: I have a fear. It's that breaking down usage by subscriber and customer (New Yorker and tourist) and gender might ultimately not be interesting.

Work so far

I scraped the citibike data. It was far too big to work with on my machine (~15GB), so I decided to only look at 2017 (which still clocks in at a decent 3GB). I did some EDA to see if there are differences between subscribers and customers:

[Screenshot, 2018-07-17 08:43:55]
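A minimal sketch of how a subscriber/customer comparison by gender could look in pandas. It uses the gender codes from the Summary (1.0 men, 2.0 women, 0.0 unknown). The column names `usertype` and `gender` are assumptions about the CSV layout, `gender_label` is a made-up helper column, and the rows are invented sample data, not the real 2017 file.

```python
import pandas as pd

# Invented sample rows; column names are assumed, not confirmed by the thread
trips = pd.DataFrame({
    "usertype": ["Subscriber", "Subscriber", "Customer",
                 "Customer", "Subscriber", "Customer"],
    "gender": [1.0, 2.0, 0.0, 1.0, 1.0, 2.0],
})

# Drop trips with unknown gender (code 0.0)
known = trips[trips["gender"] != 0.0].copy()

# Translate the numeric codes into readable labels
known["gender_label"] = known["gender"].map({1.0: "men", 2.0: "women"})

# Count trips for every user type / gender combination
counts = pd.crosstab(known["usertype"], known["gender_label"])
print(counts)
```

The resulting table can be passed straight to `counts.plot(kind="bar")` for the quick bar chart mentioned in the Summary.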

But also, and perhaps more interesting, I had a look at trips that were longer than a week (a citibike ride is free for up to 30 or 45 minutes, depending on whether you're a subscriber or a customer). There were some 9,000 of them. Below is the output of a .value_counts():

[Screenshot, 2018-07-17 09:00:42]
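A minimal sketch of a filter for trips longer than a week, followed by a `.value_counts()`. It assumes a `tripduration` column measured in seconds and a `usertype` column; both names are assumptions, and the rows are invented sample data.

```python
import pandas as pd

# Invented sample rows; `tripduration` is assumed to be in seconds
trips = pd.DataFrame({
    "tripduration": [600, 1500, 700_000, 1_000_000, 2400],
    "usertype": ["Subscriber", "Customer", "Customer",
                 "Subscriber", "Customer"],
})

WEEK_IN_SECONDS = 7 * 24 * 60 * 60  # 604,800 seconds

# Keep only the trips that lasted longer than one week
long_trips = trips[trips["tripduration"] > WEEK_IN_SECONDS]

# Count the long trips per user type
long_trip_counts = long_trips["usertype"].value_counts()
print(long_trip_counts)
```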

If this sounds more interesting, it might be a better direction for this project to pursue. Lastly, I also thought it might be fun to focus on the bikes instead of how people use them. What is the most popular bike in the bikeshare system? Meet bike No. 25738:

[Screenshot, 2018-07-17 09:03:17] [Screenshot, 2018-07-17 09:03:01]
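A minimal sketch of how the most-used bike could be found, assuming a `bikeid` column (the name is an assumption). The rows are invented; only the bike number 25738 comes from the post above.

```python
import pandas as pd

# Invented sample rows; one row per trip
trips = pd.DataFrame({
    "bikeid": [25738, 18001, 25738, 30210, 25738, 18001],
})

# Number of trips per bike, sorted from most to least used
trips_per_bike = trips["bikeid"].value_counts()

# The bike with the highest trip count, and that count
most_popular_bike = trips_per_bike.idxmax()
most_popular_trips = trips_per_bike.max()

# All trips made on that one bike, for a closer look
bike_trips = trips[trips["bikeid"] == most_popular_bike]
print(most_popular_bike, most_popular_trips)
```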

Checklist

This checklist must be completed before you submit your draft.

jsoma commented 6 years ago

"might ultimately not be interesting"

In that case, you just do a boring story! No big deal. You can also give up and grab another data set if you must.

playfairbot commented 6 years ago

Howdy! I'm a little robot, here for a surprise inspection.

Please post your first revision! It should be posted by Thursday at midnight. More details available here.

You need some feedback, let me summon @ElinaMak, @zle2105, @adrianblanco for you

adrianblanco commented 6 years ago

I had a look at trips that were longer than a week...

This sounds really appealing to me. First, I'd double-check that this is not an error in the data. Then, what if you try to plot the main routes people took when they rented a bike for more than a week? Do you think that would be possible?

Also, I think routes can be interesting too. Which are the main ones? Which bike stations are not used very much?

Good luck!

zle2105 commented 6 years ago

I really like the focus on the most popular bike. It could be interesting to map out its route throughout the city (maybe in a later project where we can use maps).

Is there a way to get the stop and start latitude to give you the average distance of the trips?
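One possible approach, sketched with the haversine (great-circle) formula. It gives the straight-line distance between the start and end stations, not the route actually ridden, so a round trip back to the same station counts as zero. The coordinates below are invented examples, and `haversine_km` is a helper written for this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Invented (start_lat, start_lon, end_lat, end_lon) tuples
sample_trips = [
    (40.7500, -73.9900, 40.7600, -73.9900),
    (40.7300, -74.0000, 40.7300, -73.9800),
    (40.7000, -73.9900, 40.7000, -73.9900),  # same start and end station
]

distances = [haversine_km(*trip) for trip in sample_trips]
average_km = sum(distances) / len(distances)
print(f"Average straight-line distance: {average_km:.2f} km")
```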

sarahslo commented 6 years ago

if you are going to do the most popular bike, can you show me on a map where it has been used? or can you show me the top 10 most popular bikes mapped? this makes me wonder if it's because the bike is in a certain location or if it's just 'luck'

i'd also be curious to know about the length of rentals. it never occurred to me that someone would rent a citibike for more than a trip. again, with this dataset you want to see that mapped if you are looking at how long the bike was out.

nice approach.

dz2383 commented 6 years ago

Interesting topic! Since you have coordinates of start and end points, could you make a map of those routes? That would be fun!

maxarvid commented 6 years ago

Update

Your project content:

Bike 25738, hereby referred to as Phil, was the most popular Citibike in 2017. And they had quite the year:

Number of trips per day:

[Screenshot, 2018-07-31 16:13:42]

Duration of average trip per week:

[Screenshot, 2018-07-31 16:15:02]
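A minimal sketch of how the two series above (trips per day and average trip duration per week) could be computed for a single bike with pandas `resample`. The column names `starttime` and `tripduration` (seconds) are assumptions, and the rows are invented sample data.

```python
import pandas as pd

# Invented trips for one bike; `tripduration` is assumed to be in seconds
bike_trips = pd.DataFrame({
    "starttime": pd.to_datetime([
        "2017-03-06 08:00", "2017-03-06 18:00",
        "2017-03-07 09:00", "2017-03-14 07:30",
    ]),
    "tripduration": [600, 900, 1500, 1200],
}).set_index("starttime")

# Number of trips per calendar day (days without trips show up as 0)
trips_per_day = bike_trips.resample("D").size()

# Mean trip duration per week, in minutes (weeks end on Sunday)
avg_minutes_per_week = bike_trips["tripduration"].resample("W").mean() / 60

print(trips_per_day)
print(avg_minutes_per_week)
```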

Wow Phil! What's that weird outlier? What really did happen on August 10?

In the next exciting installment of Phil's 2017, you'll find out what we know about their 3-day hiatus from the system, who rode with Phil the most, and, to be a bit original, what the longest periods of non-use were.

The sum of time at rest by week of the year. The beginning of December was rough.

[Screenshot, 2018-07-31 16:25:14]
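A minimal sketch of one way to compute time at rest per week: treat the gap between one trip's `stoptime` and the next trip's `starttime` as rest, then sum the gaps by week. The column names are assumptions, the rows are invented, and attributing each gap to the week in which it ends is a simplification chosen for this sketch, not necessarily how the chart above was built.

```python
import pandas as pd

# Invented trips for one bike, sorted chronologically
bike_trips = pd.DataFrame({
    "starttime": pd.to_datetime([
        "2017-12-04 08:00", "2017-12-04 12:00",
        "2017-12-06 09:00", "2017-12-12 10:00",
    ]),
    "stoptime": pd.to_datetime([
        "2017-12-04 08:30", "2017-12-04 12:20",
        "2017-12-06 09:45", "2017-12-12 10:15",
    ]),
}).sort_values("starttime").reset_index(drop=True)

# Rest before each trip = this trip's start minus the previous trip's stop
rest = bike_trips["starttime"] - bike_trips["stoptime"].shift(1)

# Convert to hours; the first trip has no previous trip, so it is NaN
bike_trips["rest_hours"] = rest.dt.total_seconds() / 3600

# Sum rest per week (weeks end on Sunday); NaN values are skipped
rest_hours_by_week = (
    bike_trips.set_index("starttime")["rest_hours"].resample("W").sum()
)
print(rest_hours_by_week)
```

A gap that spans a week boundary is counted entirely in the later week here; splitting such gaps across weeks would need extra handling.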

Any changes in direction or topic?

Yep, I'm dropping the investigation of who uses the bikeshare system and how. It's been done really thoroughly by Todd Schneider here. Instead I'm going ahead with the single bike.

Problems/Questions

While there are coordinate points available in the data for the start and stop stations, none is available for the actual routes. My inclination is to not use maps for this one, but could be convinced otherwise (Todd's leaflet map is super cool).

Checklist