ept / ddia2-feedback

Reader feedback on the early release of Designing Data-Intensive Applications, second edition

Chapter 3: Query languages for documents #20

Closed by mpalriwal-Netflix 3 hours ago

mpalriwal-Netflix commented 3 days ago

Book Link

SQL Pipe Syntax :: Query Languages Section

I came across Google's research paper, "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL," which introduces a piped data-flow syntax to SQL. This could enhance the section on query languages by showing how SQL is evolving. For example, the shark sightings query could be expressed as:

FROM observations
|> WHERE family = 'Sharks'
|> AGGREGATE SUM(num_animals) AS total_animals
   GROUP BY date_trunc('month', observation_timestamp) AS observation_month

This approach aims to address some of SQL's traditional challenges, such as its complexity and steep learning curve, by extending the language rather than replacing it.
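For comparison, here is a rough sketch of how the same aggregation might look in conventional SQL, wrapped in Python so it can be run against an in-memory SQLite database. The table and column names come from the pipe query above. Everything else is an assumption made for illustration: the sample rows, the `monthly_shark_totals` helper, the `ORDER BY` (added only to make the output deterministic), and the small `date_trunc` stand-in, which is needed because SQLite has no built-in `date_trunc`.

```python
import sqlite3

# Conventional (non-pipe) form of the aggregation. Table and column names are
# taken from the pipe query above; the ORDER BY is added for stable output.
CONVENTIONAL_SQL = """
SELECT date_trunc('month', observation_timestamp) AS observation_month,
       SUM(num_animals) AS total_animals
FROM observations
WHERE family = 'Sharks'
GROUP BY observation_month
ORDER BY observation_month
"""


def date_trunc(unit, timestamp):
    # Hypothetical stand-in for PostgreSQL's date_trunc; SQLite has none.
    # Handles only 'month' and assumes ISO-8601 text timestamps.
    if unit != "month":
        raise ValueError("this sketch only supports 'month'")
    return timestamp[:7] + "-01"


def monthly_shark_totals(rows):
    # Load the rows into a throwaway in-memory table and run the query.
    conn = sqlite3.connect(":memory:")
    conn.create_function("date_trunc", 2, date_trunc)
    conn.execute(
        "CREATE TABLE observations ("
        "observation_timestamp TEXT, family TEXT, num_animals INTEGER)"
    )
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?)", rows)
    result = conn.execute(CONVENTIONAL_SQL).fetchall()
    conn.close()
    return result


# Made-up sample data, purely for illustration.
SAMPLE_ROWS = [
    ("2024-01-05 10:00:00", "Sharks", 3),
    ("2024-01-20 14:30:00", "Sharks", 2),
    ("2024-02-02 09:15:00", "Sharks", 4),
    ("2024-02-03 11:00:00", "Whales", 7),
]

if __name__ == "__main__":
    for month, total in monthly_shark_totals(SAMPLE_ROWS):
        print(month, total)
```

In the conventional form, the clauses are written in a different order from the one in which they are logically applied (`SELECT` comes first but is evaluated after `FROM`, `WHERE`, and `GROUP BY`), whereas the pipe form lists each step in the order the data flows through it.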


Rationale:

criccomini commented 3 days ago

I'm hesitant to add this. It's a little too early to know whether it's going to be widely adopted. So far, BigQuery has adopted it, and it sounds like Spark has plans to adopt it as well. But it's still quite nascent, as I understand it. We also stripped out some of the data processing discussion around Pig, which is very similar in its goals.

mpalriwal-Netflix commented 2 days ago

Thanks, @criccomini, for sharing your perspective. I understand your hesitation, and it makes sense to be cautious about including technologies that are still in their early stages of adoption.

For observability systems, we host a working group under the CNCF on query standardization, and we are also actively considering something similar to pipe SQL (or even simpler) as the open-source standard for querying OpenTelemetry data.

https://github.com/cncf/tag-observability/blob/main/working-groups/query-standardization.md

ept commented 3 hours ago

I agree with @criccomini; I don't think we should be going into vendor-specific SQL extensions that are not yet widely adopted. If we were going to cover recent developments in SQL, I'd be more inclined to cover GQL, an extension of SQL to support Cypher-like graph queries, which at least is an ISO standard as of a few months ago.