dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.54k stars, 1.45k forks

Integrations: tracking list of library operators/resources to build #1302

Closed (natekupp closed 5 years ago)

natekupp commented 5 years ago

This issue serves as a running list of the integrations we already have versus those we still need to build.

Hadoop

🚫 Presto
🚫 Hive
🚫 distcp, also better S3 / GCS support
🚫 Sqoop
🚫 Pig

AWS

✅ EMR
✅ S3
🚧 Move Redshift from airline_demo to dagster_aws (#1371)
🚫 SageMaker

GCP

✅ BigQuery
✅ Cloud Dataproc
🚫 Cloud Bigtable
🚫 Cloud Dataflow
🚫 Cloud Datastore

Azure

🚫 TBD?

Miscellaneous

🚫 Bash
🚫 Generic JDBC
🚫 FTP / SFTP
🚫 MongoDB
🚫 Segment
🚫 MySQL
🚫 Postgres (think about the overlap w/ Redshift here?)
🚫 Druid / Pinot
🚫 SendGrid
🚫 Twilio
🚫 Discord
🚫 Papertrail (should be modeled as part of @mgasner's logging work instead?)
🚫 Loggly

natekupp commented 5 years ago

Moving this to the wiki.

bruth commented 4 years ago

When I click on the wiki, there appear to be no pages (it redirects back to the main repo page)? I was curious about the Presto integration in particular from the list. Another idea/need is running Kubernetes jobs.