kuwala-io / kuwala

Kuwala is the no-code data platform for BI analysts and engineers that enables you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, and Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demographics data b) Points of interest from OpenStreetMap c) Google Popular Times
https://kuwala.io
Apache License 2.0
778 stars 51 forks

New Pipeline: Instagram post scraper #60

Open iam-benyamin opened 2 years ago

iam-benyamin commented 2 years ago

This is going to be a new pipeline scraping public posts on Instagram with meta information such as location and hashtags.
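
As a rough illustration of the kind of record such a pipeline might emit, here is a minimal sketch of a post with its location and hashtag metadata. The class name, field names, and helper functions are all hypothetical and not taken from the Kuwala codebase; the sketch only parses a caption that has already been fetched and does not perform any scraping.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical record layout: these field names are assumptions,
# not Kuwala's actual schema.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


@dataclass
class InstagramPost:
    post_id: str
    caption: str
    location_name: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    hashtags: List[str] = field(default_factory=list)


def extract_hashtags(caption: str) -> List[str]:
    """Return lowercase hashtags in order of first appearance, without duplicates."""
    seen: List[str] = []
    for tag in HASHTAG_PATTERN.findall(caption):
        tag = tag.lower()
        if tag not in seen:
            seen.append(tag)
    return seen


def build_post(
    post_id: str,
    caption: str,
    location_name: Optional[str] = None,
    latitude: Optional[float] = None,
    longitude: Optional[float] = None,
) -> InstagramPost:
    """Assemble a post record, deriving the hashtags from the caption."""
    return InstagramPost(
        post_id=post_id,
        caption=caption,
        location_name=location_name,
        latitude=latitude,
        longitude=longitude,
        hashtags=extract_hashtags(caption),
    )
```

Posts without a tagged location would simply keep the location fields as `None`, so downstream steps can filter on them.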

mattigrthr commented 2 years ago

It should be possible to scrape public Instagram posts using hashtags, locations, and (public) users. This article provides some insights and ideas: https://blog.apify.com/scrape-instagram-posts-comments-and-more-21d05506aeb3/

Since @bmahmoudyan can't continue working on this issue, it's up for grabs again. :)

mattigrthr commented 2 years ago

The requirements for a new pipeline are the following:

arifluthfi16 commented 2 years ago

I am interested in taking this issue. I will be splitting the pipeline into 2 parts:

just like the Google POI pipelines

I'm currently still working on the Instagram scraper and looking at what's possible.

arifluthfi16 commented 2 years ago

PR for this issue: #74