python / pythondotorg

Source code for python.org
https://www.python.org
Apache License 2.0

Missing donors from new donors list #2522

Closed by ezio-melotti 2 months ago

ezio-melotti commented 2 months ago

Describe the bug: The legacy donors list was migrated to a new page -- see:

However, over 1000 donors are missing from the new page, which includes only 4820 names compared to the 5923 on the legacy page.

To Reproduce: Open the two pages and count the names.

Expected behavior: I would expect the current page to have all the old names, plus any new donors who donated after the legacy page stopped being updated.

Additional context: See https://github.com/python/pythondotorg/issues/709#issuecomment-2330949133. I was told there to ping @indepndnt.

indepndnt commented 2 months ago

Thank you for your review, @ezio-melotti. As you note, without any change the donor list would be expected to only grow longer forever. It was for this reason that a few years ago we decided to base the published donors list on donations from the prior three years. This is introduced in the paragraph directly above the list (emphasis added): "[...] Donors are listed in descending order using an algorithm that takes into account both the amount of donation and the age of the donation, based on all donations received within the past three years."
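
The weighting described in that paragraph can be tried out in isolation. The following is a minimal, standalone sketch, not the site's actual code: it assumes exponential decay (as the docstring quoted later in this thread describes), a hypothetical half-life of 365 days (the real value is not shown in this thread), and that the three-year cutoff is inclusive. `Donation` and `rank_donors` are names invented here for illustration.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Donation(NamedTuple):
    # Hypothetical record type; the real data source is not shown in the thread.
    name: str
    amount: float
    age_days: int


def rank_donors(donations, halflife_days=365.0, max_age_days=365 * 3):
    """Return (rank, total, name) tuples, highest rank first.

    halflife_days is an assumed value chosen for illustration only.
    """
    decay = -math.log(2) / halflife_days
    ranks = defaultdict(float)
    totals = defaultdict(float)
    for d in donations:
        if d.age_days > max_age_days:
            continue  # older than the cutoff: the donation is not counted at all
        totals[d.name] += d.amount
        # A donation loses half of its weight every halflife_days.
        ranks[d.name] += d.amount * math.exp(decay * d.age_days)
    return sorted(
        ((ranks[n], totals[n], n) for n in ranks if ranks[n]),
        reverse=True,
    )


donations = [
    Donation("Alice", 100.0, 0),     # given today, full weight
    Donation("Bob", 150.0, 365),     # one half-life old, weighs about 75
    Donation("Carol", 500.0, 1200),  # past the three-year cutoff, dropped
]
ranking = rank_donors(donations)
for rank, total, name in ranking:
    print(f"{name}: rank={rank:.1f} total={total:.2f}")
```

Under these assumptions Alice ranks above Bob even though Bob gave more, because his donation has decayed by half, and Carol does not appear at all because her only donation is older than three years. The second effect is the one that explains the shorter list reported in this issue.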

Aside from the age cutoff and the source data no longer coming from a flat text file, the donors list generation has been largely unchanged since the early 2000s (as far as I can tell; this was long before my time). I took this docstring from the comments in the original script.

    def ranking(self):
        """
        Utilities for calculating the ranking on the donation page
        This code was originally written by Marc-Andre Lemburg

        Use exponential decay for ranking of donations.
        After an idea by Tim Peters. Thanks, Tim !
        """
        start = time.monotonic_ns()
        logger.info("Ranking donations ...")
        donation_amounts = {}

        for record in get_honor_roll_data(max_age=365 * 3):
            donation_amounts.setdefault(record.name, []).append((record.amount, record.age))

        self.ranked_donations = []
        LN2 = math.log(2)
        exponent = -LN2 / self.donation_halflife
        for donor, donations in donation_amounts.items():
            rank = 0.0
            donor_total = 0.0
            for amount, days_old in donations:
                donor_total += amount
                rank += amount * math.exp(exponent * days_old)
            if rank:
                self.ranked_donations.append((rank, donor_total, donor))

        self.ranked_donations.sort(reverse=True)
        end = time.monotonic_ns()
        # monotonic_ns() returns nanoseconds; divide by 1,000,000 to report milliseconds
        logger.info("Ranked donations in %sms", f"{(end - start) // 1_000_000:,}")

ezio-melotti commented 2 months ago

Thanks for confirming that this is indeed intentional -- I missed the bit about the three years cutoff. I think this issue can be closed then.