apache / parquet-site

Apache Parquet Site
https://parquet.apache.org/
Apache License 2.0

PARQUET-2259: Update site to sync with latest parquet-format #31

Closed wgtmac closed 10 months ago

wgtmac commented 1 year ago

I have copied the corresponding text from parquet-format so that it is easy to update in the future. Please take a look, thanks! @gszadovszky @shangxinli

gszadovszky commented 1 year ago

Thanks for taking care of this, @wgtmac! Have you copied from master or from the latest release? (I think the latest release would be preferred.) Also, what do you think about updating the release process to ensure we would always do this update in case of a parquet-format release?

wgtmac commented 1 year ago

> Have you copied from master or from the latest release? (I think the latest release would be preferred.)

I copied from master because the latest release, v2.9.0, came out almost two years ago.

> what do you think about updating the release process to ensure we would always do this update in case of a parquet-format release

This is a good idea. I will try to update it.

BTW, what is the difference between the staging and production branches of this repo? I assume we should push to staging first, then to production?

gszadovszky commented 1 year ago

Even though the last release is old, no one should implement the new features until they have been released. It should work similarly to the releases of implementations like parquet-mr. So I would vote for copying the latest available release to the home page. (We might even state the version number there and note that other releases and master are available on GitHub.)

TBH, I am not sure about staging. I just realized you've created this PR for staging. There should be a staging site for parquet, but it should be available for testing. I don't think we would need a PR process for staging (at least not for committers). The production branch should clearly update the site parquet.apache.org. The last time I updated this site, it worked differently. @shangxinli, could you help us understand how the site should be updated? At the least, we should mention the actual sites for staging and prod in the README.

wgtmac commented 1 year ago

There is a document for staging and production, but I still don't know where the staging site is.

OK, I will close this PR since it targets the staging branch.

gszadovszky commented 1 year ago

@wgtmac, I've finally found it: https://parquet.staged.apache.org/ I don't think staging makes sense this way. The two branches have already diverged from each other. I think a better approach would be a way to check the result of an actual PR before merging it. Anyway, we do not update the site that frequently. Let's open a PR against production and continue there.