Open inigmatus opened 8 years ago
We can set up an Elasticsearch container to handle querying, and use a formula to calculate "trending" mods (i.e. mod downloads over a certain time period: week, month, etc.).
What about creating a metric by combining pageviews and downloads in some way?
I believe the followers count has a significant effect on rankings, which makes sense. Download counts alone can be more of an indication of how many releases there are for a given mod, rather than actual unique users.
So, working as designed. Gotcha. I'll drop the enhancement on this and leave this as a discussion for a few more days, unless anyone has any suggestions to change the ranking method.
Do we know how accurate the download counts are currently? I remember KerbalStuff had a big problem with download counts getting artificially inflated due to failed download restarts. I think SirCmpwn fixed that, but I don't know for sure.
There were problems in the past where download managers would supposedly be counted as hundreds or thousands of individual downloads (according to Majir), but that seems to have been fixed a long time ago.
For the past six months or so download counts have been very weird. There was a consistent count of 50-100 downloads per day of old versions (usually the oldest) of mods. Sometimes these old version downloads were significantly higher than for the latest version. It happened for many mods; there were a number of complaints about it in the KS thread. It seems unlikely that large numbers of users were downloading mods for years old versions of KSP.
In the future, popular mods should be ranked by user ratings (see the rating enhancement request).
Currently this is done via the total download count, which isn't terrible, but it is kinda bad as it does mean BDA and scatterer will likely stay there.
I think the best possible approach is likely a sliding window of download counts over a certain time period (let's say a week), but that would require a new database table, and I'm not sure how fast the query could be - it might have to be done in a batch job (which I'd like to avoid).
EDIT: Here's one approach to an implementation (sliding window) http://pastebin.com/xp4KmWib
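To make the sliding-window idea concrete, here is a minimal in-memory sketch. It is not the code from the pastebin link, and it is not how SpaceDock stores downloads: the `download_log` list of `(mod_id, downloaded_at)` pairs and the `trending` function are hypothetical stand-ins for whatever table and query would actually be used.

```python
from collections import Counter
from datetime import datetime, timedelta


def trending(download_log, now, window=timedelta(days=7), limit=10):
    """Rank mods by the number of downloads inside the window ending at `now`.

    `download_log` is an iterable of (mod_id, downloaded_at) pairs. This
    in-memory shape is an assumption for illustration; a real version would
    most likely be a SQL query with a WHERE clause on the timestamp.
    """
    cutoff = now - window
    counts = Counter(
        mod_id
        for mod_id, downloaded_at in download_log
        if cutoff < downloaded_at <= now
    )
    # most_common sorts by count, highest first
    return counts.most_common(limit)
```

With a window like this, an established mod with a huge lifetime total only ranks highly if it is still being downloaded now, which is the property the total-count ranking lacks.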
This seems to be the actual logic now:
Score = 10*Followers + Downloads + NumReleases/5 + NumMedia, plus:
It looks complex, but the Downloads factor is going to dominate completely when looking at the biggest and most established mods. 100 downloads isn't very many, and 10 is trivial, so the penalty/bonus section can basically be ignored. Nobody's going to have a thousand releases or videos, so those can be disregarded as well. Scatterer has 1183 followers, which equals 11830 points, but even that is tiny compared to its 842961 downloads.
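The arithmetic can be checked with a quick sketch. Only the base terms of the formula are included; the penalty/bonus part is left out, and the release and media counts are set to zero as placeholders because the thread doesn't give them. The function name `base_score` is made up for this example.

```python
def base_score(followers, downloads, num_releases, num_media):
    # Base terms only; the penalty/bonus part of the formula is omitted.
    return 10 * followers + downloads + num_releases / 5 + num_media


# Scatterer's numbers as quoted above; releases and media are placeholders.
followers, downloads = 1183, 842961
score = base_score(followers, downloads, num_releases=0, num_media=0)

# Fraction of the score that comes from followers (roughly 1.4%)
follower_share = 10 * followers / score
```

So even with the 10x weight, followers contribute well under 2% of the score for a mod of that size.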
For the past six months or so download counts have been very weird. There was a consistent count of 50-100 downloads per day of old versions (usually the oldest) of mods. Sometimes these old version downloads were significantly higher than for the latest version. It happened for many mods; there were a number of complaints about it in the KS thread. It seems unlikely that large numbers of users were downloading mods for years old versions of KSP.
There were several major bugs with how DownloadEvent rows were retrieved and updated that would have resulted in the wrong versions being incremented like that, found and fixed in #370 and https://github.com/KSP-SpaceDock/SpaceDock/pull/295#issuecomment-650461716 and KerbalStuff/KerbalStuff#121:
The download route previously retrieved DownloadEvent.created and checked whether it was more than 1 hour ago in Python code, but the check was somewhat wrong (it used timedelta.seconds, which only goes up to 86399 and so could satisfy the limit when it shouldn't). Now it enforces this limit in SQL, correctly.
Any subsequent downloads during the next 60 minutes, of any mod version, get added to this DownloadEvent, regardless of whether the mod version matches or not. If in step one an older version has been downloaded, all downloads for the next 60 minutes get added to this event, even if the latest version has been downloaded. So the download counts of the download events don't really speak the truth. You can see this if you download the download stats of any mod, and look at the creation times of the download events. If the code would work correctly, download events for different mod versions should overlap sometimes, however they never do.
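The timedelta.seconds pitfall mentioned above is easy to reproduce. The snippet below is only an illustration: the exact comparison the old route used isn't shown in this thread, so the `< limit_seconds` check is an assumption, and the actual fix moved the check into SQL rather than switching to `total_seconds()`.

```python
from datetime import timedelta

# Age of an event that was created 1 day and 10 minutes ago
age = timedelta(days=1, minutes=10)
limit_seconds = 3600  # the one-hour limit

# .seconds is only the seconds component (0..86399); whole days are ignored,
# so a day-old event looks like it is 10 minutes old.
looks_recent_buggy = age.seconds < limit_seconds

# .total_seconds() covers the whole duration, days included.
looks_recent_fixed = age.total_seconds() < limit_seconds
```

Here `looks_recent_buggy` is true and `looks_recent_fixed` is false, so the buggy check would treat a stale event as still inside the one-hour window.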
The combined effect of all the bugs would have been to make the download stats quite scrambled (though still accurate enough for mod scoring, the subject of this issue). Everything should be working well for about the past year, though.
Does anyone know how popular mods are currently ranked? It's not by downloads, or if it is, then the ranking must be taking place at intervals that are not instant.
How do we want to serve the list of popular mods going forward? Is it broken? Can it be better? What do you want to see?