[!NOTE] Hi Followers, thank you for taking the time to read me. The links below should make it easier to follow the scope and progress:
- Projects
- Milestones
- Issues
- Pull Requests
- Wiki
- Documentation
Milestone | Epic | Target Date |
---|---|---|
0.0.1 | Ready for Feedback | 1st Oct 24 |
1.0.0 | Ready for Production | 1st Nov 24 |
Business information systems require fresh data every day, organised so that retrieval is cost-effective. Building a local data platform requires a setup where you can recreate production use cases and develop new pipelines.

- What? A local data platform that can scale up to the cloud.
- Why? To save costs on cloud infrastructure and development time.
- When? At the start of the product development life cycle.
- Where? Local first.
- Who? Businesses that want a product data platform that runs locally and scales up when the time comes.

A Python library that uses open-source tools to orchestrate data platform operations locally for development and testing.
Data can be available as a single file in the source format. For example, the New York yellow taxi trip data for January 2023 can be pulled with:
curl https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2023-01.parquet -o /tmp/yellow_tripdata_2023-01.parquet
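The same download can be sketched in Python using only the standard library. This is a minimal illustration, not part of the library's API: the helper names `trip_data_url` and `download` are hypothetical, and the assumption that other months follow the same `yellow_tripdata_YYYY-MM.parquet` naming is inferred from the single URL above.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

# Base location taken from the curl command above.
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"


def trip_data_url(year: int, month: int) -> str:
    """Build the file URL for a given month.

    Assumption: every month follows the naming of the 2023-01 file.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be between 1 and 12, got {month}")
    return f"{BASE_URL}/yellow_tripdata_{year:04d}-{month:02d}.parquet"


def download(url: str, dest_dir: str = "/tmp") -> Path:
    """Stream the file at `url` into `dest_dir`, keeping its file name."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    with urlopen(url) as response, open(dest, "wb") as out:
        # Copy in chunks so the whole file is never held in memory.
        shutil.copyfileobj(response, out)
    return dest
```

Calling `download(trip_data_url(2023, 1))` should fetch the same file as the curl command and return its local path under `/tmp`.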
local-data-platform/
Human-readable format and accessible platforms like Google Sheets or Notion. Easily pushed into