python / bugs.python.org

Meta-issue tracker for bugs.python.org

What happened to "topic" URLs? #67

Open abravalheri opened 8 months ago

abravalheri commented 8 months ago

In the setuptools docs there are some old references to URLs like bugs.python.org/setuptools/issue38, bugs.python.org/setuptools/issue41, etc...

Does anyone know where these URLs ended up? Or were they wrong in the first place?

From the point of view of preservation and retrieving information, it would be nice if we could retain these tickets as reference/read-only.

I often find myself going through the commit history to understand why something is implemented in a certain way, or whether I can remove some parts of the code that look like they are not doing anything... And more often than not I find comments referencing old issue and PR URLs. If the links are lost, that negatively impacts the understanding and the maintenance of the code.

terryjreedy commented 8 months ago

Numbered bpo URLs link to the corresponding GitHub URLs, which carry any further activity. In your mentioned PR, you could have further updated "https://bugs.python.org/issue8557" to "https://github.com/python/cpython/issues/52803". There is no more activity yet, but it is still open. (However, no one has yet been willing to do a mass auto-update, even if possible.)

@gvanrossum Do you remember anything about no-number 'topic' urls?

@abravalheri This tracker is for action items; generally questions should go elsewhere. If no answer here, try https://discuss.python.org/c/documentation/26, where your question might be seen by 'old-timers' who don't read current issues.

gvanrossum commented 8 months ago

Sorry, I don't remember any details about bpo -- I was never involved in setting it up or administering it.

hugovk commented 8 months ago

Ping @ezio-melotti about BPO.

> Numbered bpo URLs link to the corresponding GitHub URLs, which carry any further activity. In your mentioned PR, you could have further updated "bugs.python.org/issue8557" to "python/cpython#52803". There is no more activity yet, but it is still open. (However, no one has yet been willing to do a mass auto-update, even if possible.)

I have a mass auto-update tool for normal BPO->GH issues: https://github.com/hugovk/github-tools/blob/main/bpo_redirecter.py

Here's a branch updating setuptools, I can open a PR if it's useful: https://github.com/pypa/setuptools/compare/main...hugovk:setuptools:update-bpo-links?expand=1
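
As a rough illustration of what such a rewrite involves (this is not the `bpo_redirecter.py` tool linked above, only a minimal sketch), the Python below assumes a mapping from bpo issue numbers to GitHub issue numbers is already available. The function name `rewrite_bpo_links` and the `bpo_to_github` argument are hypothetical; the only pair taken from this thread is 8557 -> 52803.

```python
import re

# Matches numbered bpo issue URLs, with or without a scheme.
# It deliberately does not match other trackers such as
# bugs.python.org/setuptools/issue38.
BPO_ISSUE_RE = re.compile(r"(?:https?://)?bugs\.python\.org/issue(\d+)")


def rewrite_bpo_links(text, bpo_to_github):
    """Replace bpo issue URLs in `text` using a {bpo_number: gh_number} mapping.

    URLs whose bpo number is missing from the mapping are left untouched.
    """

    def swap(match):
        gh_number = bpo_to_github.get(int(match.group(1)))
        if gh_number is None:
            return match.group(0)  # unknown issue: keep the original link
        return f"https://github.com/python/cpython/issues/{gh_number}"

    return BPO_ISSUE_RE.sub(swap, text)


if __name__ == "__main__":
    sample = "See https://bugs.python.org/issue8557 for details."
    print(rewrite_bpo_links(sample, {8557: 52803}))
```

Leaving unmapped URLs unchanged keeps the rewrite safe to run repeatedly while the mapping is still incomplete.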

ezio-melotti commented 7 months ago

> Does anyone know where these URLs ended up? Or were they wrong in the first place?

https://bugs.python.org/setuptools/ wasn't a "topic", but it was one of the trackers (aka instances) that we used to host on Roundup (together with the b.p.o, Roundup, Jython, and meta-tracker -- all separate issue trackers). IIRC it was one of the first to migrate (apparently to https://bitbucket.org/pypa/setuptools/issues first, and then to https://github.com/pypa/setuptools/issues), long before CPython did.

I tried to search the setuptools GitHub tracker for some of the issues referenced in the page you linked, but I couldn't find them. It might be possible that only open issues got copied over (manually) to GitHub when the migration(s) happened and that those old issues are lost.

If you want to link to something, one option is to use the Wayback Machine, e.g. http://web.archive.org/web/20170315145453/http://bugs.python.org/setuptools//issue2 That could also be a good starting point if you want to try to find the exact title and see if they are still somewhere on GitHub.
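
The Wayback Machine link above follows the pattern `http://web.archive.org/web/<timestamp>/<original URL>`. The sketch below builds and splits such links; the helper names are hypothetical, and nothing here checks whether a snapshot actually exists for a given URL and timestamp.

```python
import re

WAYBACK_PREFIX = "http://web.archive.org/web/"
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d+)/(.+)$")


def wayback_url(original_url, timestamp):
    """Build a Wayback Machine link from an original URL and a digit timestamp."""
    if not (timestamp.isascii() and timestamp.isdigit()):
        raise ValueError("timestamp must consist of digits, e.g. 20170315145453")
    return f"{WAYBACK_PREFIX}{timestamp}/{original_url}"


def split_wayback_url(url):
    """Return (timestamp, original_url), or None if `url` is not a Wayback link."""
    match = WAYBACK_RE.match(url)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    link = wayback_url("http://bugs.python.org/setuptools//issue2", "20170315145453")
    print(link)
    print(split_wayback_url(link))
```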

@ewdurbin might have a backup of the setuptools instance, so you could ping them if you want to try to restore those old issues somehow/somewhere.

See also these issues for some background:

P.S.: thanks @hugovk for the ping!

abravalheri commented 7 months ago

Thank you very much for the information @ezio-melotti, that is very helpful!

> @ewdurbin might have a backup of the setuptools instance, so you could ping them if you want to try to restore those old issues somehow/somewhere.

@ewdurbin, would it be possible to do something like the suggestion above? Do you happen to have this backup?

ewdurbin commented 7 months ago

Yes a backup exists. It won't be trivial to prepare in a useful manner.

The tracker directory with the files/msgs directories is pretty easy.

The database will require working from an ancient Postgres data directory since a dump of just the setuptools tracker db does not exist.

Are both needed for the purpose?

ewdurbin commented 7 months ago

Ok, wasn't too bad to extract a pg_dump of the db.

@ezio-melotti a tarball of the files/msgs and a pg dump of the tracker db are in your home directory on bugs.ams1.psf.io.

ee@bugs:~$ sudo ls -alhtr /home/psf-users/ezio/ | grep setuptools
-r--r-----  1 ezio ezio 5.2M Nov 28 13:37 tracker-setuptools.tgz
-r--r-----  1 ezio ezio 1.2M Nov 28 13:37 roundup_setuptools.pgdump

ezio-melotti commented 7 months ago

Thanks @ewdurbin for looking into this!

@abravalheri, how do you want to proceed from here? IIUC your initial goal was to fix those links, so I see 3 options:

  1. find an existing online copy of those issues and link to that (e.g. the Wayback Machine);
  2. if that's not available, restore the lost issues from the backup (somewhere/somehow);
  3. remove the links and replace them with a summary of the issue (maybe in a footnote).

abravalheri commented 7 months ago

Thank you very much, @ezio-melotti and @ewdurbin

> The tracker directory with the files/msgs directories is pretty easy.

I think the most important thing is indeed the messages that have been exchanged. Not sure how the DB would be used.

Regarding the options discussed by @ezio-melotti, probably 1 or 2 would work better because we don't "lose information" in the process of creating a summary (sometimes the devil is in the details).

I am going to be away from a proper computer for the next month, but I can have a look at option 1 once I am back. One question though, how "stable"[^1] are Wayback Machine links? I have never used them for information archival purposes, so I don't know what the quirks are...

[^1]: In the long run I mean...

hugovk commented 7 months ago

> One question though, how "stable"[^1] are Wayback Machine links? I have never used them for information archival purposes, so I don't know what the quirks are...

Once archived, they should be around for the long run (unless the original site owner requests deletion).

When it comes to replacing 404 links with Wayback Machine links, generally it's better to find a live version or alternative if possible, as Wayback Machine can be a bit slow to load.

But if there's no other version, or the original is important, then it's fine to link to Wayback Machine.
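
That preference (live copy first, Wayback Machine as the fallback) could be sketched roughly as follows for the old setuptools tracker links. Everything named here is an assumption for illustration: the `replace_dead_links` function, the `live_alternatives` mapping, and the `example.org` URL are placeholders, and the timestamp is simply the one from the snapshot linked earlier in this thread. A capture is not guaranteed to exist for every issue, so each generated link should still be checked by hand.

```python
import re

# Matches references such as bugs.python.org/setuptools/issue38, with or
# without a scheme. The lookbehind skips URLs already embedded in another
# URL (for example an existing Wayback Machine link).
OLD_TRACKER_RE = re.compile(
    r"(?<![\w/.])(?:https?://)?bugs\.python\.org/setuptools/issue(\d+)"
)


def replace_dead_links(text, live_alternatives, snapshot_timestamp):
    """Rewrite old setuptools tracker links found in `text`.

    A live alternative from `live_alternatives` ({issue_number: url}) is
    preferred; otherwise a Wayback Machine link is generated.
    """

    def pick(match):
        issue = int(match.group(1))
        if issue in live_alternatives:
            return live_alternatives[issue]
        original = f"http://bugs.python.org/setuptools/issue{issue}"
        return f"http://web.archive.org/web/{snapshot_timestamp}/{original}"

    return OLD_TRACKER_RE.sub(pick, text)


if __name__ == "__main__":
    docs = "See bugs.python.org/setuptools/issue38 and bugs.python.org/setuptools/issue41."
    # Placeholder mapping: no real live copy of issue 38 is known from this thread.
    print(replace_dead_links(docs, {38: "https://example.org/issue38"}, "20170315145453"))
```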