apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Direct users to new docs #29283

Open BasPH opened 1 year ago

BasPH commented 1 year ago

What do you see as an issue?

The old docs are still online (https://airflow.apache.org/docs/apache-airflow/1.10.4/index.html) and there's no clear indication of these pages being old and no link to the new docs.

Some Google searches lead to the old docs, e.g. my first result when googling "airflow web server port" is the Airflow 1.10.4 docs.

Solving the problem

I see multiple solutions:

Taragolis commented 1 year ago

+100500

Also, from time to time https://airflow.readthedocs.io/ ranks higher in Google than https://airflow.apache.org/

Taragolis commented 1 year ago

Some additional findings about robots.txt:

It seems that for airflow.apache.org all pages carry the same weight for search engines.

potiuk commented 1 year ago

Absolutely yes.

potiuk commented 1 year ago

I think removal is not a good idea, but a large warning, no-indexing, and some hinting that the old versions should not be indexed are all good.

When a page disappears from search engines it's all but gone (except for those who want to follow direct links).

We could simply find a new place for those old docs and publish them statically elsewhere after we apply warnings.

I am also inclined to do that with (some of) the past versions of Airflow 2.
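To make the no-indexing idea concrete, here is a minimal sketch of how a docs build step could decide which pages get a `noindex` hint. It is only an illustration: the function name `robots_meta_for`, the `MIN_INDEXED_VERSION` cutoff, and the assumption that every versioned page lives under `/docs/apache-airflow/<version>/` (as in the 1.10.4 URL from the issue) are hypothetical, not how the Airflow site is actually built.

```python
import re

# Hypothetical cutoff: versions below this get a noindex hint.
MIN_INDEXED_VERSION = (2, 0, 0)

# Assumed URL layout, modelled on /docs/apache-airflow/1.10.4/index.html.
VERSION_RE = re.compile(r"^/docs/apache-airflow/(\d+)\.(\d+)\.(\d+)/")

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta_for(path: str, min_indexed: tuple = MIN_INDEXED_VERSION) -> str:
    """Return a noindex meta tag for old versioned doc pages, else ''."""
    match = VERSION_RE.match(path)
    if match is None:
        # Not a numbered version (e.g. /docs/apache-airflow/stable/...).
        return ""
    version = tuple(int(part) for part in match.groups())
    # Tuples compare element by element, so (1, 10, 4) < (2, 0, 0).
    return NOINDEX_TAG if version < min_indexed else ""
```

A page that carries this tag stays reachable through direct links, which matches the goal of keeping old docs online while steering search engines away from them. Crawlers can only honor the tag if they are still allowed to fetch the page.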

jedcunningham commented 1 year ago

+1 to:

I don't think we should remove old docs completely though.

vincbeck commented 1 year ago

I was looking into working on this issue but could not find where the robots.txt file of the Airflow website is defined. Do you know by any chance?

o-nikolas commented 1 year ago

> I was looking into working on this issue but could not find where the robots.txt file of the Airflow website is defined. Do you know by any chance?

We let Hugo generate it: https://github.com/apache/airflow-site/blob/bc075d5a2049fed3ea71f27fac08712b043beb18/landing-pages/site/config.toml#L20

Right now it's on "/", so search engines basically crawl all pages. But we can customize it by providing a robots.txt in a few locations (see the docs here).
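As a rough illustration of what a customized robots.txt could do, the sketch below uses Python's standard `urllib.robotparser` to check how a hand-written rule set would treat an old and a current docs URL. The `Disallow` rule, and the choice of 1.10.4 as the only excluded version, are assumptions for the example, not the rules the Airflow site actually ships.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep crawlers out of one old docs version.
ROBOTS_TXT = """\
User-agent: *
Disallow: /docs/apache-airflow/1.10.4/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a generic crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


old_url = "https://airflow.apache.org/docs/apache-airflow/1.10.4/index.html"
new_url = "https://airflow.apache.org/docs/apache-airflow/stable/index.html"

print(is_crawlable(old_url))  # False
print(is_crawlable(new_url))  # True
```

Keep in mind that a `Disallow` rule only stops crawling. A page blocked this way can still show up in search results if other sites link to it, so it is not a full substitute for a noindex hint.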

vincbeck commented 1 year ago

Thanks! Looking at it! A user actually got confused by this today. See Slack thread.

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has been open for 365 days without any activity. There have been several Airflow releases since the last activity on this issue. Kindly recheck the report against the latest Airflow version and let us know if the issue is reproducible. The issue will be closed in the next 30 days if no further activity occurs from the issue author.