sodafoundation / crystal

crystal provides a unified metadata management platform for unstructured data for cloud and more.
Apache License 2.0
3 stars 3 forks

Requirements for SODA Crystal - Add your pain points OR use cases OR feature requirements #9

Open skdwriting opened 7 months ago

skdwriting commented 7 months ago

Issue/Feature Description: You can add all your pain points OR use cases OR feature requirements for the SODA Crystal project.

Project Focus: Unstructured metadata management

You can add all your inputs in the comments. We will brainstorm to produce the first list.

Reference: You can refer to some of the basic information collected or prepared for SODA Crystal here.

Example 1: We are struggling with metadata search for S3, especially the performance. Please find the details of our concern here OR attach the information.

Example 2: I found a project which handles unstructured metadata management. Can we take inputs from there?

Example 3: Feature request: Could storing huge amounts of IoT data in a common format be a good feature? <More info here - link can be added or attach>

Example 4: Do we consider data lake pain points like ?

skdwriting commented 7 months ago

Adding the input from @thatsdone here #8

thatsdone commented 7 months ago

(Possible) Feature Request

Add OCI (Oracle Cloud Infrastructure) Object Storage Service support.

As I think we need to discuss the support matrix topic with Strato, I filed an issue (https://github.com/sodafoundation/strato/issues/1425) there too.

skdwriting commented 4 months ago

Comments from Rakesh, IBM:

skdwriting commented 4 months ago

Comments from Rakesh, IBM (SODA TOC): There are challenges in managing Apache Parquet files for large data sets and their schemas. Lakehouse solutions (Apache Iceberg, Apache Hudi) solve this. However, if we can provide a specific solution, it will be useful to the many organizations that are not able to deploy a lakehouse kind of solution.

dinkar--s commented 4 months ago

From the competitive analysis, it looks like there could be two different focus areas for Crystal - (1) intelligent search of data sets and (2) data management using metadata. These two would have different requirements.

Searching inside unstructured data could be an area to explore (Subhankar)