e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

using timeseries_sample.ipynb, steps to understanding the data #651

Open jruzekowicz opened 3 years ago

jruzekowicz commented 3 years ago

Below are a few outlined steps I used to begin understanding the transformation of raw data through the pipeline stages.

  1. I would first suggest opening one of the raw data files such as: https://raw.githubusercontent.com/e-mission/e-mission-server/master/emission/tests/data/real_examples/shankari_2015-jul-22
  2. The raw data file shows you how the data is laid out. Next, you may want to see how this data is processed through the pipeline, using the provided notebook, timeseries_sample.ipynb.
  3. Open this notebook; it should already be set to use the July 22nd data. The notebook lets you step through small pieces of code that teach you how to access the database and objects such as the confirmed and cleaned trips and sections.
  4. I would add a print statement for each variable you are unsure of, and run that line of code within the notebook. This lets you follow the process step by step and see how the queries pull from the stored data after it has run through the intake pipeline.
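
To make steps 1 and 4 a bit more concrete, here is a minimal sketch of how you might inspect the layout of a raw data file before opening the notebook. It assumes the file is a JSON list of entries, each with a `metadata` dict that carries a `key` and a `data` dict; that layout is my assumption and you should check it against the file you downloaded. The helper `summarize_entries` and the sample values are made up for illustration and are not part of e-mission.

```python
import json
from collections import Counter


def summarize_entries(entries):
    """Count entries per metadata key and collect the data field names seen for each key."""
    counts = Counter()
    fields = {}
    for entry in entries:
        # Assumed layout: entry["metadata"]["key"] names the kind of entry
        key = entry.get("metadata", {}).get("key", "<no key>")
        counts[key] += 1
        fields.setdefault(key, set()).update(entry.get("data", {}).keys())
    return counts, fields


# Tiny made-up sample in the assumed layout; these are NOT real values from the file.
sample_entries = [
    {"metadata": {"key": "background/location", "write_ts": 1.0},
     "data": {"latitude": 37.0, "longitude": -122.0, "ts": 1.0}},
    {"metadata": {"key": "background/location", "write_ts": 2.0},
     "data": {"latitude": 37.1, "longitude": -122.1, "ts": 2.0}},
    {"metadata": {"key": "background/motion_activity", "write_ts": 3.0},
     "data": {"type": 0, "confidence": 90, "ts": 3.0}},
]

# To run this on a downloaded file instead (path is hypothetical):
# with open("shankari_2015-jul-22") as f:
#     sample_entries = json.load(f)

counts, fields = summarize_entries(sample_entries)
for key, n in counts.most_common():
    # One line per entry type: how many entries, and which data fields they carry
    print(key, n, sorted(fields[key]))
```

Printing the counts per key gives a quick picture of what kinds of entries the raw file contains, which makes it easier to recognize them later when the notebook queries the database.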

Some videos to help you set up and understand how to use Jupyter Notebook: https://www.youtube.com/watch?v=DKiI6NfSIe8, https://www.youtube.com/watch?v=HW29067qVWk

shankari commented 3 years ago

@jruzekowicz can you maybe submit this as a PR to the documentation, either here or in the server repo?