Open pavithraes opened 1 year ago
Hi, I was approved for the initial stage of Outreachy. I noticed the datasets are in Parquet format. I need some clarity and guidance: can Bokeh read the Parquet files directly? I was able to read them using pandas. @pavithraes
Hello @pavithraes, in this issue, will we mainly focus on cleaning and preprocessing the data, as well as visualizing it using Bokeh with as many important plots as possible?
Hello @pavithraes, I am exploring this data for the project Create a Blog Post Series - "Fundamentals of Data Visualization in Bokeh". I will use some Python libraries to summarize and analyze the data, after data wrangling and processing, to visualize it as per the project requirements, and then I'll use it in the project.
Hi @robinokwanma, open the link to the dataset website, scroll down, and you will see a hyperlink, "Working with PARQUET format", right under the "Data Dictionary and MetaData" subtitle. There are details on how to work with the format in there, and full details in the "trip record user guide".
Thank you @anisheremariam . I'm taking a look now
Hi @pavithraes @anisheremariam Does this work? https://gist.github.com/robinokwanma/cc81d1a9f491377f963216848c036d26
That's the link to my GitHub gist on this microtask. Please review.
@pavithraes started with this https://gist.github.com/Soot3/9eaf170fa2048e373e05046222350f54
@anisheremariam thank you for answering the question @robinokwanma, I was about to ask the same question.
@pavithraes I realized that the dataset is published on a monthly basis. Can we download more than one month's data for the exploration?
@robinokwanma, I opened the file and noticed that although the code seems fine, some variables are wrongly placed. Please fix that.
@oluwaseun-tech, you are most welcome. Each month has over a million rows of data. If you are sure you can handle multiple months, that's great, but I suggest you use a subset of the data. That is just my opinion.
Thanks, I have made the changes.
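For anyone who wants more than one month without loading millions of rows into every plot, here is a minimal sketch of the subset idea, assuming each month is already loaded as a pandas DataFrame. The helper name `sample_months`, the sample size, and the tiny stand-in frames are illustrative assumptions, not part of the task description.

```python
import pandas as pd


def sample_months(frames, n_per_month=50_000, seed=42):
    """Take a reproducible random sample from each monthly DataFrame and stack them."""
    samples = []
    for df in frames:
        # Never ask for more rows than the month actually has.
        n = min(n_per_month, len(df))
        samples.append(df.sample(n=n, random_state=seed))
    # ignore_index gives the combined frame a clean 0..N-1 index.
    return pd.concat(samples, ignore_index=True)


# Tiny stand-in frames (hypothetical values) so the sketch runs without a download.
january = pd.DataFrame({"trip_distance": [1.2, 3.4, 0.8, 5.1], "month": "2022-01"})
february = pd.DataFrame({"trip_distance": [2.0, 7.3, 1.1], "month": "2022-02"})

subset = sample_months([january, february], n_per_month=2)
print(subset.shape)
```

Fixing `random_state` keeps the subset the same between runs, which makes it easier for reviewers to reproduce the plots.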
Oh! Okay thank you
@pavithraes please take a look at what I have done so far https://gist.github.com/oluwaseun-tech/ef413dd9658b2123bfc7240652bae90b
Hello @pavithraes @anisheremariam Here is my work on the analysis of NYC Taxi data in a Jupyter Notebook. I have also attached pictures of the output after the code. Reviews would be appreciated.
https://github.com/BhaswatiRoy/Data-Analysis-Projects/tree/main/Bokeh_Plots
@robinokwanma I had a similar problem. You can use pd.read_parquet to load the dataset.
Hi @BhaswatiRoy, your choice of visualizations is really cool.
Hi @JoyclynUjunwaOgbonna, have you been able to solve that via the solutions I suggested earlier?
thanks @anisheremariam for the feedback, I am on my way to adding more visualizations!
That is perfect @BhaswatiRoy
@pavithraes, the link to my work on NYC Data Exploration on GitHub gist is below: https://gist.github.com/anisheremariam/e5f4cb9f46f05f7ba5aa35d449922f53 I appreciate any reviews and comments on it. Thank you
Hello @pavithraes, @bryevdv, everyone. I have an issue. My lineplot does not display as expected. If you look at it, you will see that it does not plot as expected. What can I do? Here is the link to my notebook: https://www.kaggle.com/faithnchifor/nyc-trips-viz
@Faith-Nchifor the link to your notebook is showing a 404 error: "I can't find this page". This usually happens when your Kaggle notebook is private. Could you check whether it is? If so, you might want to make it public so people can access it.
I'm sorry about that @JoyclynUjunwaOgbonna . It's now public
@Faith-Nchifor, I think it is the method you used. The chart followed the irregular order of the index. Would you consider using the groupby method?
@anisheremariam your method is good. I realized that my map behaved the way it did because I never sorted the data. It looks just like this one now. Thanks for your input
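For anyone hitting the same zigzag line, here is a minimal sketch of the groupby-then-sort approach discussed above. It assumes a pickup timestamp column named `tpep_pickup_datetime`. The function names and the four sample rows are made up for illustration.

```python
import pandas as pd
from bokeh.plotting import figure


def trips_per_hour(df, time_col="tpep_pickup_datetime"):
    """Count trips for each pickup hour, returned in ascending hour order."""
    hours = pd.to_datetime(df[time_col]).dt.hour
    # Group by hour, count rows, and sort so the line is drawn left to right.
    counts = df.groupby(hours).size().sort_index()
    return counts.rename_axis("hour").reset_index(name="trips")


def hourly_line_plot(hourly):
    """Draw trips per hour as a line; unsorted x values would make it zigzag."""
    p = figure(title="Trips per pickup hour",
               x_axis_label="Hour of day", y_axis_label="Trips")
    p.line(hourly["hour"], hourly["trips"], line_width=2)
    return p


# Hypothetical rows, deliberately out of order.
trips = pd.DataFrame({"tpep_pickup_datetime": [
    "2022-01-01 09:15:00", "2022-01-01 08:05:00",
    "2022-01-01 09:40:00", "2022-01-02 17:30:00",
]})
hourly = trips_per_hour(trips)
plot = hourly_line_plot(hourly)
print(hourly)
```

A line glyph connects points in the order they are given, so aggregating first and sorting by the x value is what keeps the line from doubling back on itself.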
Hello @bryevdv, @pavithraes Here is the link to my gist: https://gist.github.com/Faith-Nchifor/b57ee2140e2dd1ea110d5f17c54626ee My project interest is Create a blog post series: "Fundamentals of Data Visualization in Bokeh"
Hi @Faith-Nchifor well done
Hi @BhaswatiRoy, nice analysis, and your choice of visualization is really great.
If you are having any challenges regarding the project, ask on this channel. I will be glad to help anyone.
thanks @Ajoke23 for the reviews
Hello, @pavithraes @anisheremariam please take a look at my first assignment on the analysis of NYC Taxi data on Jupyter Notebook. https://gist.github.com/Isaakkamau/358d2ccff3612d95496972fa67842021
Hello everyone, my name is Anushka Sharma and I have made my contribution to the bokeh#1 project. @pavithraes @bryevdv please have a look at my assignment. Here is my gist link: https://gist.github.com/anushka-png/ffd9d83d2b6b46d169c5e510dc4123d9
I have tried to work with two different datasets: the first is the TLC Driver 24 hour course and the second is the yellow taxi dataset for October and November. For reference, I have also attached a PDF containing my outputs and other relevant data. I am contributing to a project for the first time. I appreciate any reviews and comments on it. Thank you
https://github.com/bokeh/outreach-programs/issues/6#issuecomment-1465110443 Thank you @Ajoke23. Do you have any idea how I can make my plots show in my notebook on GitHub gist?
Hi, here is my submission for the microtask on the project, Create a blog post series: "Fundamentals of Data Visualization in Bokeh." https://github.com/Azaya89/Bokeh-microtask
Attached in a separate images folder are the plots that were generated inline. For some reason, they do not appear inline in the notebook here on github.
To show a plot, call show(variable_name), where the variable name is the one you assigned when creating the plot.
@Isaakkamau, that is fine work. keep up the good work.
@anisheremariam thanks a lot, but how many visualizations are we supposed to have? I decided first to do one then I can add others if it's needed
Hello @pavithraes @anisheremariam
Please find my contribution for task 1 here. Your feedback would be appreciated.
I added the visualizations as comments since GitHub gist can't render them from my notebook.
Well done @Azaya89, you did great work.
You did great work. Well done @PatChizzy. Unique and creative visualization.
Hi all, thanks for the submissions so far! This is our first time doing Outreachy, so this is a learning experience for us as well! One thing that has become apparent is that it is a bit confusing and difficult to provide individualized comments when all the submissions are mixed together in one place like this! I'd like to ask everyone who has submitted here to open a new issue that has any relevant links, images, etc. for your work. This will allow us to have 1-1 conversations with everyone on their own issue :)
For those who might be having issues figuring it out, you can follow these steps: 1. Visit the link to the project on GitHub: https://github.com/bokeh/outreach-programs/issues/6
That's all. I hope this helps someone
I think the issue here is not the code written. What I've been able to figure out is that using output_notebook() in the Jupyter notebook is what renders it inline in your notebook, but exporting the notebook to GitHub won't render the plots, since output_notebook() is not running on GitHub. So it's best to also post the plots as a separate image.
I hope this helps.
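As an alternative to screenshots, a plot can also be written out as a standalone HTML page and shared next to the notebook. This is only a sketch of that swapped-in approach using `bokeh.embed.file_html`, not something the mentors asked for. The helper name `figure_to_html`, the file name, and the sample data are illustrative assumptions.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN


def figure_to_html(plot, title="Bokeh plot"):
    """Return a standalone HTML page that loads BokehJS from the CDN."""
    return file_html(plot, CDN, title)


# In a notebook, the variable the figure is assigned to is what show() expects:
#     from bokeh.io import output_notebook, show
#     output_notebook()
#     show(p)
p = figure(title="Example", x_axis_label="x", y_axis_label="y")
p.line([1, 2, 3], [4, 6, 5], line_width=2)

html = figure_to_html(p, title="Example plot")
# Hypothetical output file name; opening it in a browser shows the interactive plot.
with open("example_plot.html", "w", encoding="utf-8") as f:
    f.write(html)
```

If a static image is preferred, Bokeh also provides `export_png`, which needs Selenium and a browser driver installed, so saving HTML is usually the lighter option.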
This works quite alright in my Python environment. However, when the notebook has been downloaded, the images do not show.
Okay @Azaya89. I'm gonna try it out. Thanks
You're welcome.
The New York City TLC taxi trips records data is frequently used for creating examples and tutorials for Python data science workflows. You can access the dataset through any of the following ways:
Note that the actual dataset is quite large, so please use a subset of the data or consider reducing it.
To complete this micro-task, download and explore a subset of the dataset with Bokeh plots. You can share your Jupyter Notebooks with us as a GitHub gist. As per Bryan's comment here, please open separate issues/PRs with your work, so that we can share feedback individually.