Status: Open (opened by rterror 2 years ago)
Todo:
[Paper] Photon: A Fast Query Engine for Lakehouse Systems #3 https://cs.stanford.edu/people/matei/papers/2022/sigmod_photon.pdf
[Paper] Computation reuse via fusion in Amazon Athena #4 https://www.amazon.science/publications/computation-reuse-via-fusion-in-amazon-athena
[rebalance] https://blog.csdn.net/monkeyboy_tech/article/details/125548654
[Apache YARN optimization practice at Bilibili] https://www.iteblog.com/archives/10165.html
[Spark 3.3 new feature] https://www.iteblog.com/archives/10185.html
[OPPO big data offline computing architecture] https://www.iteblog.com/archives/10070.html
[Uber reduce big data platform cost] https://www.iteblog.com/archives/10009.html
[spark streaming] https://www.databricks.com/blog/2022/06/28/project-lightspeed-faster-and-simpler-stream-processing-with-apache-spark.html
[SPIP stage level resource allocation] https://issues.apache.org/jira/browse/SPARK-27495
https://www.iteblog.com/ppt/data-ai-summit-2021/stage-level-scheduling-improving-big-data-and-ai-integration_iteblog.com.pdf
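[Google Napa] Google has replaced Mesa with Napa internally and runs it at large scale in production. The paper was published at VLDB 2021: https://research.google/pubs/pub50617/ There is also a public talk: https://www.youtube.com/watch?v=dtWwUWB5JyQ Worth reading the paper and listening to the tech talk when time allows.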
https://blog.zhuangty.com/napa/
[cpp] Herb's new talk premieres at this time tomorrow; worth a look for anyone working in C++: Can C++ be 10x Simpler & Safer? - Herb Sutter - CppCon 2022 https://www.youtube.com/watch?v=ELeZAKCN4tY
Those interested can also follow Herb's blog: https://herbsutter.com/gotw/
History:
[docker on yarn] https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/data-operating-system/content/run_docker_containers_on_yarn.html
[partial agg] https://www.slidestalk.com/slidestalk/1AggregatePushDownthroughJoinlatest90930?video
[spark 3.x plugin framework] https://blog.csdn.net/monkeyboy_tech/article/details/115969494 https://canali.web.cern.ch/docs/Spark_Dashboard_Demo.mp4 https://github.com/cerndb/SparkPlugins