Closed anshulxyz closed 2 years ago
@kelson42 I'm gonna need some directions to review this. Let me know when you have a few minutes to discuss it.
Hi @rgaudin, I have updated the script as per the feedback.
I had missed the comment at https://github.com/openzim/cms/pull/14#discussion_r723402447
So I updated the commit and (force) pushed.
How come the generated output only has 25 rows while the downloaded JSON appears to have 57 .zim.torrent entries in the first row?
Because I am clubbing together entries like
wikipedia_en_all_maxi_2020-12
wikipedia_en_all_maxi_2020-06
wikipedia_en_all_maxi_2021-02
and getting the collective score for the wikipedia_en_all_maxi
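The grouping described above can be sketched roughly as follows. This is only an illustration, not the script from this PR: the helper names `base_name` and `collective_counts`, the sample numbers, and the choice to aggregate by summing are all assumptions.

```python
import re
from collections import defaultdict
from typing import Dict

# Trailing "_YYYY-MM" period, as in "wikipedia_en_all_maxi_2020-12".
PERIOD_SUFFIX = re.compile(r"_\d{4}-\d{2}$")


def base_name(entry: str) -> str:
    """Drop a ".zim.torrent" or ".zim" extension and the trailing period (hypothetical helper)."""
    for ext in (".zim.torrent", ".zim"):
        if entry.endswith(ext):
            entry = entry[: -len(ext)]
            break
    return PERIOD_SUFFIX.sub("", entry)


def collective_counts(counts: Dict[str, int]) -> Dict[str, int]:
    """Sum per-entry counts under their shared base name (summing is an assumption)."""
    totals: Dict[str, int] = defaultdict(int)
    for entry, count in counts.items():
        totals[base_name(entry)] += count
    return dict(totals)


# Made-up counts, only to show the shape of the data.
sample = {
    "wikipedia_en_all_maxi_2020-12.zim.torrent": 5,
    "wikipedia_en_all_maxi_2020-06.zim.torrent": 3,
    "wikipedia_en_all_maxi_2021-02.zim.torrent": 2,
}
grouped = collective_counts(sample)
```

With this approach, three dated entries collapse into a single `wikipedia_en_all_maxi` row, which is why the output has fewer rows than the JSON has entries.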
Yeah, that makes sense. So my question is: how come we have so few results? The JSON from that request seems to only provide 100 results while we have more than a thousand ZIM files. Is it capped? We obviously need to compute this for all ZIMs.
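If the response does turn out to be capped per request, one common way around it is to page through the results until a short page comes back. The sketch below is generic and makes no claim about the actual endpoint: `fetch_page(offset, limit)` is a hypothetical callable standing in for whatever request the script makes, and `fake_fetch` is an in-memory stand-in used only to exercise the loop.

```python
from typing import Callable, Iterator, List


def iter_all_results(
    fetch_page: Callable[[int, int], List[dict]], page_size: int = 100
) -> Iterator[dict]:
    """Yield every result by requesting successive pages (hypothetical helper)."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A page shorter than page_size means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


# In-memory stand-in for the real request; the data is made up.
_FAKE_RESULTS = [{"name": f"file_{i}.zim.torrent"} for i in range(250)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return _FAKE_RESULTS[offset : offset + limit]


all_results = list(iter_all_results(fake_fetch))
```

Whether the real source supports offset-style paging at all would need to be checked against its documentation.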
@rgaudin if I wanted to submit changes to the script, how should I go about it? Do you want me to submit a PR, or something else (like TBD)?
I want to fix the issue raised in https://github.com/openzim/cms/pull/14#discussion_r724839900
@anshulxyz, if it's just about that one thing, we can live with it and when time comes to reuse this code, it can be fixed then. If you have additional contributions, open another PR.
Thanks again for your work.
This is regarding issue #11
For calculation of the score, I am using the formula
Where:
x is the input rank
min is the minimum rank in the series
max is the maximum rank in the series
y is the resulting rescaled score

My resulting output file is like this:
output.csv
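The variables defined above (x, min, max, y) match the usual min-max rescaling, so a minimal sketch of that technique is given here. It assumes the standard form y = (x - min) / (max - min) mapped onto a target range. The exact formula and target range used in this PR are not shown in this text, so the function names and the 0 to 1 default range are assumptions.

```python
from typing import List


def rescale(x: float, lo: float, hi: float,
            new_min: float = 0.0, new_max: float = 1.0) -> float:
    """Min-max rescale x from [lo, hi] to [new_min, new_max] (assumed standard form)."""
    if hi == lo:
        # Degenerate series: every rank is identical, so avoid dividing by zero.
        return new_min
    return new_min + (x - lo) * (new_max - new_min) / (hi - lo)


def rescale_series(values: List[float]) -> List[float]:
    """Rescale every value using the minimum and maximum of the series itself."""
    lo, hi = min(values), max(values)
    return [rescale(v, lo, hi) for v in values]


# Made-up ranks, only to show the behaviour.
scores = rescale_series([10, 20, 30])
```

The smallest input maps to the bottom of the target range and the largest to the top. Different bounds can be passed through `new_min` and `new_max` if another scale is wanted.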
how to run
output.csv file.