Open Badaro opened 2 months ago
@aliquanto3 mentioned to me on Discord that there are some players with more than 200 cards listed in the CSV, and given that this is a Duel Commander tournament, that should have been impossible for a single player.
This confirms that the anonymization logic in place is trivial and causes duplicates, and means that besides the issues mapping the Top 8 players we also have no way to match those duplicates correctly if we stick to the CSV.
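A quick way to spot these merged aliases is to total the cards per masked name and flag anyone above the format's deck size. The sketch below is only an illustration: the column names `player` and `quantity`, the sample rows, and the 100-card limit used as the default threshold are assumptions, not the actual layout of the Manatraders CSV.

```python
import csv
import io
from collections import Counter


def cards_per_player(csv_text, player_col="player", qty_col="quantity"):
    """Sum card quantities per (masked) player name.

    The column names are hypothetical; adjust them to the real CSV header.
    """
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[player_col]] += int(row[qty_col])
    return totals


def flag_merged_aliases(totals, max_cards=100):
    """Return the names holding more cards than a single deck allows."""
    return sorted(name for name, count in totals.items() if count > max_cards)


# Made-up sample data: two different players collapsed under the same alias.
SAMPLE = """player,card,quantity
L**g,Island,60
L**g,Mountain,40
L**g,Forest,99
L**g,Plains,1
A**a,Swamp,100
"""

if __name__ == "__main__":
    totals = cards_per_player(SAMPLE)
    print(flag_merged_aliases(totals))
```

Any alias this flags almost certainly covers more than one real player, so its rows cannot be split back into individual decks from the CSV alone.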
As @Aliquanto3 said on X, I'd be able to publish a Jupyter Notebook for scraping decklists with player aliases. Rounds won't be usable anyway, except for the Top 8, where a bit of prediction could help rebuild their path through the rounds.
Looks like Manatraders eventually fixed the Standings, but not the Pairings. For now I'll remove those and scrape the remaining information.
Manatraders has changed their website to hide the names of the players aside from the Top 8. https://www.manatraders.com/tournaments/53
That by itself is not a problem... except you can no longer find the Top 8 players in the decklists CSV. My guess is that the anonymization logic was done in a hurry and they're hiding the player names in the CSV but not the standings, but this makes the CSV fairly useless as you can't match the Top 8 players to their deck.
The fact that you can get the player names by simply opening the decklists is strong evidence that this implementation was not very well thought out. As an example, just by clicking on the decklist page you can find out that "L**g" is "Lordegg".
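To illustrate why a mask like this produces duplicates, here is a minimal sketch. It assumes the site keeps the first and last characters and replaces everything in between with `**`; that rule is only inferred from the single "L**g" / "Lordegg" example above and may not be what Manatraders actually does. Every name other than "Lordegg" is invented.

```python
from collections import defaultdict


def mask_name(name):
    """Assumed masking rule: first char + '**' + last char."""
    if len(name) < 2:
        return name
    return f"{name[0]}**{name[-1]}"


def find_collisions(names):
    """Group names by their mask and keep only masks shared by several names."""
    groups = defaultdict(list)
    for name in names:
        groups[mask_name(name)].append(name)
    return {mask: members for mask, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    # "Lightning" and "Zebra" are hypothetical player names.
    print(mask_name("Lordegg"))
    print(find_collisions(["Lordegg", "Lightning", "Zebra"]))
```

Under this assumed rule, any two players whose names share a first and last letter end up with the same alias, which would explain single "players" holding several decks' worth of cards.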
For now I've disabled the scraper, but there are a few options going forward:
Option 2 is likely the best solution for now: it's easy to implement and ensures compatibility. I'll probably wait a few weeks, though, to see if there are more changes to the website. Considering how trivial it is to bypass the current anonymization logic, I'm assuming further changes will be necessary.