Open maxarvid opened 6 years ago
"might ultimately not be interesting"
In that case, you just do a boring story! No big deal. You can also give up and grab another data set if you must.
.unstack().reset_index()
I think you'll be able to get your double-groupby into a grouped or stacked bar plot.
Howdy! I'm a little robot, here for a surprise inspection.
Please post your first revision! It should be posted by Thursday at midnight. More details available here.
You need some feedback, let me summon @ElinaMak, @zle2105, @adrianblanco for you
I had a look at trips that were longer than a week...
This sounds really appealing to me. First, I'd double-check that this is not an error in the data. Then, what if you tried plotting the main routes people took when they rented a bike for more than a week? Do you think that would be possible?
Also, I think routes can be interesting too. Which are the main ones? Which bike stations are not used very much?
Good luck!
I really like the focus on the most popular bike. It could be interesting to map out its route throughout the city (maybe in a later project where we can use maps).
Is there a way to get the stop and start latitude to give you the average distance of the trips?
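One way to approach this, as a rough sketch: the haversine formula gives the straight-line distance between a start and an end station from their coordinates. The sample coordinates below are made up, and the tuple layout is an assumption for illustration, not the layout of the actual trip files.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * r * math.asin(math.sqrt(a))


# Made-up sample trips: (start_lat, start_lon, end_lat, end_lon)
trips = [
    (40.7128, -74.0060, 40.7306, -73.9866),
    (40.7580, -73.9855, 40.7580, -73.9855),  # returned to the same station
]

distances = [haversine_km(*t) for t in trips]
average_km = sum(distances) / len(distances)
print(round(average_km, 2))
```

Keep in mind this is only a lower bound on the distance actually ridden: it ignores the street grid, and a round trip that ends at its starting station counts as zero.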
if you are going to do the most popular bike, can you show me on a map where it has been used? or can you show me the top 10 most popular bikes mapped? this makes me wonder if it's because the bike is in a certain location or if it's just 'luck'
i'd also be curious to know about the length of rentals. it never occurred to me that someone would rent a citibike for more than a trip. again, with this dataset you want to see that mapped if you are looking at how long the bike was out.
nice approach.
Interesting topic! Since you have coordinates of start and end points, could you make a map of those routes? That would be fun!
Bike 25738, hereby referred to as Phil, was the most popular Citibike in 2017. And they had quite the year:
Number of trips per day:
Duration of average trip per week:
Wow Phil! What's that weird outlier? What really did happen on August 10?
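For anyone following along, the two series above (trips per day, average trip duration per week) could be computed roughly like this with pandas. This is a sketch, not the code from the repository: the rows are made up, and the column names `starttime` and `tripduration` (in seconds) are assumptions about how the trip data is laid out.

```python
import pandas as pd

# Made-up rows standing in for one bike's trips; column names are assumptions.
trips = pd.DataFrame({
    "starttime": pd.to_datetime([
        "2017-01-02 08:00",
        "2017-01-02 18:30",
        "2017-01-03 09:15",
        "2017-01-10 12:00",
    ]),
    "tripduration": [600, 900, 1200, 300],  # seconds
})

by_time = trips.set_index("starttime")

# Number of trips per calendar day (days without a trip show up as 0).
trips_per_day = by_time["tripduration"].resample("D").count()

# Mean trip duration per week, converted from seconds to minutes.
mean_minutes_per_week = by_time["tripduration"].resample("W").mean() / 60

print(trips_per_day)
print(mean_minutes_per_week)
```

A single very long trip can drag a weekly mean far upward, so an outlier week is worth checking against the median or the raw rows before reading too much into it.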
In the next exciting installment of Phil's 2017, you'll find out what we know about their 3 day hiatus from the system, who rode with Phil the most, and, to be a bit original: what were the longest periods of non-use?
The sum of time at rest by weeks of the year. Beginning of December was rough.
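A minimal sketch of how time at rest could be derived, assuming one bike's trips with `starttime` and `stoptime` columns (assumed names, made-up rows): rest is the gap between the end of one trip and the start of the next, summed by the week in which the gap began.

```python
import pandas as pd

# Made-up trips for a single bike; column names are assumptions.
trips = pd.DataFrame({
    "starttime": pd.to_datetime(
        ["2017-12-01 08:00", "2017-12-01 12:00", "2017-12-04 09:00"]
    ),
    "stoptime": pd.to_datetime(
        ["2017-12-01 08:30", "2017-12-01 12:20", "2017-12-04 09:10"]
    ),
}).sort_values("starttime")

# Rest = time between the end of one trip and the start of the next one.
trips["rest"] = trips["starttime"].shift(-1) - trips["stoptime"]
rest = trips.dropna(subset=["rest"])  # the last trip has no following trip

# Attribute each gap to the ISO week in which it began, then sum in hours.
week = rest["stoptime"].dt.isocalendar().week
rest_hours_by_week = rest["rest"].dt.total_seconds().groupby(week).sum() / 3600

longest_rest = rest["rest"].max()
print(rest_hours_by_week)
print(longest_rest)
```

Attributing a whole gap to the week it started in is a simplification; a gap that spans a week boundary would need to be split if the weekly totals have to be exact.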
Yep, I'm dropping the investigation of who uses the bikeshare system and how. It's been done really thoroughly by Todd Schneider here. Instead I'm going ahead with the single bike.
While there are coordinate points available in the data for the start and stop stations, none are available for the actual routes. My inclination is to not use maps for this one, but I could be convinced otherwise (Todd's leaflet map is super cool).
Pitch
CitibikeNYC has exploded in popularity. What stories do its data tell us? Is there a difference in how the system is used by New Yorkers and tourists?
Summary
Here the gender code for 1.0 signifies men, 2.0 women, and 0.0 unknown (I'll be dropping that one at a later stage). I could do a quick bar chart, using a similar color palette to the below inspiration:
Details
Possible headline(s): 2017 Bikeshare usage
Data set(s): https://s3.amazonaws.com/tripdata/index.html
Code repository: https://github.com/maxarvid/data-studio/tree/master/code/02_bike_share
Possible problems/fears/questions: I have a fear. It's that breaking down usage by subscriber and customer (New Yorker and tourist) and gender might ultimately not be interesting.
Work so far
I scraped the citibike data. It was far too big to work with on my machine (~15GB) so I decided to only look at 2017 (still clocks in at a decent 3GB). I did some EDA to see if there are differences between subscribers and customers:
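The kind of breakdown meant here can be sketched as a two-level groupby that is unstacked into one column per gender code. This is an illustration on made-up rows, not the code from the repository, and the column names `usertype`, `gender` and `tripduration` are assumptions.

```python
import pandas as pd

# Made-up rows; gender codes follow the pitch (1 = men, 2 = women, 0 = unknown).
trips = pd.DataFrame({
    "usertype": ["Subscriber", "Subscriber", "Customer", "Subscriber", "Customer"],
    "gender": [1, 2, 0, 1, 0],
    "tripduration": [600, 900, 2400, 300, 1800],  # seconds
})

# Count trips per (usertype, gender), then pivot gender into columns.
counts = (
    trips.groupby(["usertype", "gender"])
    .size()
    .unstack(fill_value=0)
)

# Median trip length in minutes for each user type.
median_minutes = trips.groupby("usertype")["tripduration"].median() / 60

print(counts)
print(median_minutes)

# With matplotlib installed, counts.plot(kind="bar", stacked=True) would draw
# a stacked bar chart; counts.reset_index() flattens it back into columns.
```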
But also, and perhaps more interesting, I had a look at trips that were longer than a week (a citibike ride is free for up to 30 minutes/45 minutes depending on whether you're a subscriber or a customer). There were some 9000 of them. Here below a .value_counts()
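As a rough sketch of that filter: keep trips whose duration exceeds seven days, then tally them. The rows are made up, `tripduration` is assumed to be in seconds, and counting by `usertype` is only a guess at a useful tally, since the thread does not say which column was counted.

```python
import pandas as pd

WEEK_SECONDS = 7 * 24 * 60 * 60  # 604800

# Made-up rows; bike ids and column names are hypothetical.
trips = pd.DataFrame({
    "bikeid": [1, 2, 3, 4],
    "usertype": ["Subscriber", "Customer", "Customer", "Subscriber"],
    "tripduration": [900, 700000, 1500000, 1200],  # seconds
})

# Keep only trips that lasted longer than a week.
long_trips = trips[trips["tripduration"] > WEEK_SECONDS]

# Tally the long trips by user type (an assumed choice of column).
long_by_usertype = long_trips["usertype"].value_counts()

print(len(long_trips))
print(long_by_usertype)
```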
If this sounds more interesting, it might be a better direction for this project to pursue. Lastly, I also thought it might be fun to focus on the bikes instead of how people use them. What is the most popular bike in the bikeshare system? Meet bike No. 25738:
Checklist
This checklist must be completed before you submit your draft.