🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
When a company has multiple data sources, its data engineers need to migrate data from one source to another.
Data is also scattered across applications such as Gmail / Slack / …, and it is time-consuming for a company to write several tools to collect it.
Describe Your Proposed Solution
User stories
Story 1
As a data engineer, I want to transform raw data into analysable data and migrate it to another data source.
Possible pipelines
Concrete examples
e.g. raw transaction data is not directly analysable, but the weekly transaction amount and transaction count are.
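The weekly aggregation in the example above could look roughly like the following. This is only a minimal sketch in plain Python, not an Instill Core pipeline: the sample rows and the helper names `week_start` and `weekly_summary` are hypothetical, and weeks are assumed to start on Monday.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical raw transactions as (transaction date, amount); not real data.
transactions = [
    (date(2024, 1, 1), 120.0),   # Monday
    (date(2024, 1, 3), 80.0),    # same week
    (date(2024, 1, 8), 50.0),    # Monday of the following week
    (date(2024, 1, 14), 25.5),   # Sunday of that following week
]


def week_start(day: date) -> date:
    """Return the Monday of the week containing `day` (assumed week boundary)."""
    return day - timedelta(days=day.weekday())


def weekly_summary(rows):
    """Aggregate raw rows into a per-week total amount and transaction count."""
    totals = defaultdict(lambda: {"amount": 0.0, "count": 0})
    for day, amount in rows:
        bucket = totals[week_start(day)]
        bucket["amount"] += amount
        bucket["count"] += 1
    # Sort by week start so the output is stable and easy to load elsewhere.
    return dict(sorted(totals.items()))


summary = weekly_summary(transactions)
for start, stats in summary.items():
    print(start.isoformat(), stats["amount"], stats["count"])
```

In a real pipeline the input rows would come from the source data component and the summary rows would be written to the destination data component.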
Story 2
As a data engineer, I want to transform unstructured data into analysable data and load it into another data source.
Possible pipelines
Concrete example
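As one hypothetical illustration of this story (not taken from the proposal itself), the sketch below extracts structured order records from free-text messages and loads them into a table. The message format, the `ORDER_PATTERN` regex, the helper names, and the `orders` table are all assumptions, and an in-memory SQLite database stands in for the destination data source.

```python
import re
import sqlite3

# Hypothetical free-text messages, e.g. collected from a chat application.
messages = [
    "Order #1001 shipped to Alice, total $42.50",
    "FYI: lunch at noon?",
    "Order #1002 shipped to Bob, total $7.00",
]

# Assumed message format; real unstructured data would need a more robust extractor.
ORDER_PATTERN = re.compile(
    r"Order #(?P<order_id>\d+) shipped to (?P<customer>\w+), "
    r"total \$(?P<total>\d+\.\d{2})"
)


def extract_orders(texts):
    """Return (order_id, customer, total) tuples for the messages that match."""
    rows = []
    for text in texts:
        match = ORDER_PATTERN.search(text)
        if match:
            rows.append(
                (int(match["order_id"]), match["customer"], float(match["total"]))
            )
    return rows


def load_orders(conn, rows):
    """Load the structured rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the destination data source
rows = extract_orders(messages)
load_orders(conn, rows)
loaded = conn.execute(
    "SELECT order_id, customer, total FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Messages that do not match the assumed format are skipped rather than loaded, so the destination table only holds analysable rows.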
Highlight the Benefits
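It addresses a real-world problem: data engineers can migrate and transform data between sources without writing and maintaining a separate tool for each data source or application.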
Anything Else?
Possible components
Note: Components are listed in priority order.
Data components
  RDBMS
    AWS
      RDS
    GCP
      Cloud SQL / BigQuery
    Postgres
    MySQL
    MSSQL
    Oracle DB
    …
  NoSQL
    AWS
      NoSQL (DynamoDB / MongoDB)
    GCP
      Datastore
    MongoDB
    Elasticsearch
    Cassandra
    …
  Vector DB
    Weaviate
    Qdrant
    Chroma
    Zilliz
    Milvus
  Others
    AWS
      S3
    GCP
      Google Cloud Storage
    AWS Datalake
    Google Sheet
    …
Application components
Discord / X / Slack / … are expected to be built from other tools. However, you may need to build a specific TASK for an application component, depending on your usage.
Please let us know in Slack if you have further concrete ideas for specific application components you want to build. We can discuss them in detail.
Is There an Existing Issue for This?
Where do you intend to apply this feature?
Instill Core, Instill Cloud
Is your Proposal Related to a Problem?
Background
Reference tools
Milestones
Note