.parquet not .pq - GitHub Issues

gdmcbain commented 1 month ago

Description

As discussed on Slack yesterday, the tutorial step "Create a data processing pipeline" writes Parquet files with the suffix .pq, which doesn't seem to be the standard one (that's .parquet); .pq isn't recognized by, for example, Data Wrangler.

I propose changing all uses of .pq to the standard .parquet.

Documentation page (if applicable)

Searching for .pq matches five files, two in tests and three in docs; e.g.:

https://github.com/kedro-org/kedro/blob/c2d7100a6bdf0dd51e80a2eade0b5d3f3a71184b/docs/source/tutorial/create_a_pipeline.md?plain=1#L203
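For anyone wanting to reproduce the search, here is a rough sketch in Python. It is not the command used above; the function names, the glob patterns, and the choice to match `.pq` only when no further word character follows it are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Match ".pq" only when it is not followed by another word character,
# so ".pqx" or ".pq_old" are left alone. (Illustrative choice, not from the issue.)
PQ_SUFFIX = re.compile(r"\.pq\b")


def find_pq_references(root, patterns=("*.md", "*.py", "*.yml")):
    """Return (path, line_number, line) for every line that mentions a .pq suffix.

    `root` and `patterns` are hypothetical parameters for this sketch.
    """
    hits = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if PQ_SUFFIX.search(line):
                    hits.append((path, number, line.strip()))
    return hits


def replace_pq_suffix(text):
    """Rewrite each matched .pq suffix in `text` to .parquet."""
    return PQ_SUFFIX.sub(".parquet", text)
```

Running `find_pq_references(".")` from a checkout of the repository should list the candidate lines; each one still deserves a manual look before applying `replace_pq_suffix`, since a regex cannot tell a file suffix from unrelated text.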

deepyaman commented 1 month ago

@gdmcbain Do you want to open a PR to address this? I agree that .pq is nonstandard (or at least .parquet is much more common) and see no reason not to make this docs change.

gdmcbain commented 1 month ago

Yes, just running the test suite locally now. Ta.

SajidAlamQB commented 1 month ago

Completed in: https://github.com/kedro-org/kedro/pull/4254

kedro-org / kedro

.parquet not .pq #4253
