bryzgaloff / airflow-clickhouse-plugin

The most popular ClickHouse plugin for Airflow. 🔝 Top-1% downloads on PyPI: https://pypi.org/project/airflow-clickhouse-plugin! Based on mymarilyn/clickhouse-driver.
MIT License

publishing to pypi workflow #63

Closed MaximTar closed 1 year ago

MaximTar commented 1 year ago

I tried to reproduce the pipeline described in #42 as a GitHub workflow.

Two secrets are required for correct operation: TEST_PYPI_API_TOKEN and PYPI_API_TOKEN for publishing to TestPyPI and PyPI respectively.

And please take a look at #62 when you have free time, @bryzgaloff.

Happy New Year and happy holidays!

MaximTar commented 1 year ago

I tried. In short, I could not build a reliable sensor for package availability. TestPyPI has no usable package search, so I first polled the project page:

- name: Wait publishing to TestPyPI
  run: |
    # Wait until the new version is listed on the project page.
    AVAILABLE_VERSION="wget -qO- https://test.pypi.org/project/airflow-clickhouse-plugin/ | grep -cF $(eval $VERSION)"
    until [[ $(eval "$AVAILABLE_VERSION") -gt 0 ]]
    do
      sleep 10
    done
    # Then wait until both the sdist (.tar.gz) and the wheel (.whl) are listed.
    AVAILABLE_PACKAGE_1="wget -qO- https://test.pypi.org/project/airflow-clickhouse-plugin/$(eval $VERSION)/#files | grep -cF .tar.gz"
    AVAILABLE_PACKAGE_2="wget -qO- https://test.pypi.org/project/airflow-clickhouse-plugin/$(eval $VERSION)/#files | grep -cF .whl"
    until [[ $(eval "$AVAILABLE_PACKAGE_1") -gt 0 ]] && [[ $(eval "$AVAILABLE_PACKAGE_2") -gt 0 ]]
    do
      sleep 10
    done
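Note that the loops above poll forever if the version never shows up. A minimal sketch of a bounded variant, assuming a bash-compatible shell; wait_until, version_listed and their parameters are hypothetical names for illustration, not part of the workflow in this PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper: re-run a check command until it succeeds
# or TIMEOUT seconds have passed.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Hypothetical check: succeeds once the version string ($1) appears
# on the TestPyPI project page (same URL as in the step above).
version_listed() {
  wget -qO- "https://test.pypi.org/project/airflow-clickhouse-plugin/" | grep -qF -- "$1"
}

# Possible use in the step (assumption: 10 minutes is an acceptable limit):
#   wait_until 600 10 version_listed "$(eval $VERSION)"
```

With a limit in place, the step fails with a clear message instead of running until the job itself times out.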

However, the presence of a page with packages does not guarantee that the package can be installed (I think the problem is in the internal indexing):

ERROR: No matching distribution found for airflow-clickhouse-plugin[pandas]==VERSION

Similarly, for PyPI I searched with the help of poetry:

  AVAILABLE="poetry search airflow-clickhouse-plugin | grep airflow-clickhouse-plugin | grep -oP '\(\K[^)]+'"
  # Quote both sides so an empty search result does not break the test.
  until [ "$(eval "$AVAILABLE")" = "$(eval $VERSION)" ]
  do
    sleep 10
  done

But as it turned out, this does not guarantee anything either (the same error occurred during testing). Therefore, in the end, I used a workaround that attempts the installation and repeats it until it succeeds:

  INSTALLED="python -m pip list | grep -F airflow-clickhouse-plugin | grep -cF $(eval $VERSION)"
  # Repeat the installation until the expected version shows up in pip list.
  until [[ $(eval "$INSTALLED") -gt 0 ]]
  do
    sleep 10
    python -m pip install ... || true
  done

The || true part is needed here so that the script does not fail when the installation does not succeed because the package is not yet available.
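One possible alternative to || true: the shell ignores errexit (set -e, which GitHub Actions enables by default for bash run steps) for a command used as the condition of until, so the installation itself can serve as the loop condition. A minimal sketch with an attempt limit; retry and its arguments are hypothetical names, and the pip arguments stay elided as in the snippet above:

```shell
#!/usr/bin/env bash
set -e  # mimic a step that aborts on the first failing command

# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# sleeping DELAY seconds after each failed attempt.
# Usage: retry MAX_ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  # errexit is ignored for the condition of `until`, so a failing
  # command here does not abort the script and no `|| true` is needed.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# In the workflow this could look like (arguments elided as above):
#   retry 30 10 python -m pip install ...
```

The attempt limit is a design choice: if the package never becomes installable, the step fails after a known number of tries instead of looping indefinitely.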

I tried to take the other comments into account.

MaximTar commented 1 year ago

@bryzgaloff, apparently this PR got lost among other messages, so I am writing to remind you of it (: Any ideas/suggestions/comments?

bryzgaloff commented 1 year ago

Hi @MaximTar, thank you for the reminder. The PR is still on my TODO list; unfortunately, I have been quite busy in recent weeks -_-

I will prioritize it to provide you with feedback sooner!

bryzgaloff commented 1 year ago

@MaximTar I am assigning the PR to you. Please assign it back to me once you want me to answer your questions or review it. I will mute the PR until then so that I am not notified about your ongoing work before it is ready for review.

bryzgaloff commented 1 year ago

Thank you @MaximTar for your effort here. I have implemented the publishing workflow myself in #71: PyPA has a nice guide on how to do this, which I have successfully followed, and I added the tests too. Your contribution was a valuable first step! 🔥