apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0
7.9k stars 4.27k forks source link

[Task]: Migrate `sdks/python/apache_beam/testing/` to `beam/examples/notebooks`. #26808

Closed svetakvsundhar closed 1 year ago

svetakvsundhar commented 1 year ago

What needs to happen?

26604 added an integration test script to a new folder sdks/python/apache_beam/testing/notebooks/, while current notebooks are all under beam/examples/notebooks/ (https://github.com/apache/beam/tree/master/examples/notebooks).

It would make more sense to move the test script here, to allow for the .spec files, the .ipynb notebook, and the test scripts to all be in the same folder.

The other (less preferable) alternative, is to add a copy of each notebook under sdks/python/apache_beam/testing/notebooks/. This would require everyone checking in notebooks to check them into beam/examples/notebooks as well as sdks/python/apache_beam/testing/notebooks/, which could eventually lead to inconsistency issues between the two.

Issue Priority

Priority: 3 (nice-to-have improvement)

Issue Components

svetakvsundhar commented 1 year ago

WDYT @damccorm @liferoad ?

liferoad commented 1 year ago

I don't think we can cover all notebooks under examples without changing these notebooks. You probably need to look at the most popular workflow and only add these as notebook tests.

damccorm commented 1 year ago

I think moving the test script to the notebooks folder makes sense. That should make for a more reasonable developer workflow (run pytest from the same directory you added the notebook).

The other (less preferable) alternative, is to add a copy of each notebook under sdks/python/apache_beam/testing/notebooks/. This would require everyone checking in notebooks to check them into beam/examples/notebooks as well as sdks/python/apache_beam/testing/notebooks/, which could eventually lead to inconsistency issues between the two.

I don't understand why this is true; do the notebooks have to be in the same folder in order to run tests on them? That seems like an fixable challenge. Regardless, I'm still +1 on the idea.

Nit: I assume the title should read "Migrate sdks/python/apache_beam/testing/**notebooks/** ...", right?

svetakvsundhar commented 1 year ago

I don't understand why this is true; do the notebooks have to be in the same folder in order to run tests on them? That seems like an fixable challenge. Regardless, I'm still +1 on the idea.

No they don't, what I am intending to convey is that it'll be a better developer experience to have it all under one directory.

Nit: I assume the title should read "Migrate sdks/python/apache_beam/testing/**notebooks/** ...", right?

Yes.

Sounds like we are in agreement here -- I'll put up a PR shortly.