Ability to extract and ingest content into Webiny from multiple sources

codeweft commented 2 years ago

Is your feature request related to a problem? Please describe.

We need to be able to display content from multiple sources on a webpage.

The sources are, for example, RSS feeds provided by an API, or scraped content. We will transform millions of files and ingest the content into Webiny using a GraphQL API.

The ingested content should then be published by Webiny on a webpage.
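
To make the flow concrete, here is a rough sketch of the extract and transform steps for a single RSS feed. It is only an illustration under stated assumptions: the `createArticle` mutation, the `ArticleInput` type, and the `title`/`link`/`description` fields are hypothetical and would depend on the content model defined in the Webiny project. The sketch only builds the request bodies; actually sending them to the GraphQL endpoint (URL and API token) is left out.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical mutation: the real name and input type depend on the
# content model configured in the target Webiny project.
CREATE_ARTICLE = """
mutation CreateArticle($data: ArticleInput!) {
  createArticle(data: $data) {
    data { id }
    error { message }
  }
}
"""


def extract_items(rss_xml: str) -> list[dict]:
    """Extract: pull title, link and description from each RSS <item>."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None when the child element is missing
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def to_graphql_request(item: dict) -> str:
    """Transform: wrap one item in a JSON body suitable for a GraphQL POST."""
    return json.dumps({"query": CREATE_ARTICLE, "variables": {"data": item}})


if __name__ == "__main__":
    sample_feed = """
    <rss version="2.0"><channel>
      <item>
        <title>First post</title>
        <link>https://example.com/first</link>
        <description>Hello</description>
      </item>
    </channel></rss>
    """
    for entry in extract_items(sample_feed):
        print(to_graphql_request(entry))
```

At the scale of millions of files, each of these steps would presumably run as its own serverless function, with the load step batching requests to the GraphQL API.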

There are serverless AWS services such as Step Functions, Lambda, Glue, and Athena that can help with gathering, transforming, and then publishing (ETL) the data to the Webiny GraphQL API. However, we would have to write custom code in Python (with Pulumi) to manage all of that; it's time-consuming and not easily configurable.

Describe the solution you'd like.

A simple drag-and-drop, serverless, open-source solution for ETL pipelines is missing in the industry. I'm wondering whether Webiny's architecture is extensible enough to fill that gap in some way, and whether the team would consider building it in the future.

This would make Webiny not just a content manager and publisher, but also an extractor.

Describe alternatives you've considered.

Services like n8n and Airbyte solve this problem today in some way, but they are not serverless and have potential scalability issues.

A scalable serverless solution for data extraction and ingestion would align really well with Webiny's architecture.

codeweft commented 2 years ago

Slack discussion thread here:

https://webiny-community.slack.com/archives/C014Y0HGJ0Y/p1644462213539639?thread_ts=1644462213.539639&cid=C014Y0HGJ0Y

Thanks again.

webiny-bot commented 2 years ago

This issue is stale because it has been open for 120 days with no activity. Remove the "stale-issue" label or leave a comment to revive the issue. Otherwise, it will be closed in 7 days.

webiny / webiny-js