HackerSpace-PESU / Best11-Fantasycricket

Predicting the Best 11 for a fantasy cricket game
GNU Affero General Public License v3.0

Ignore retired players #41

Closed by roysti10 3 years ago

roysti10 commented 3 years ago

Describe the bug: Retired players don't play anymore, but the web crawler still includes them. They need to be filtered out.

To Reproduce: Steps to reproduce the behavior:

  1. Follow the instructions in README.md and watch the output once the crawler starts collecting players: retired players are included. They also appear in data_crawler/ids_names.csv.

Possible Solution: In cralwer/cricketcrawler/spiders/howstat.py, in the function parse_player, yield the item only when the player is not retired:

# Only emit players that are not flagged as retired.
if not retired:
    yield PlayerItem(name=url[url.find("?PlayerID=") + 10:], gametype=gametype, folder=".", longname=name, retired=retired)

Screenshots: [Screenshot from 2020-11-14 13-51-43]

scientes commented 3 years ago

Note from my side: I included the retired players because I'm not sure that everyone I mark as retired actually is. Also, you might notice that a player can occur up to three times in the CSV, because there is a separate page for each player for Test, T20, and ODI if they played in them. A player might no longer be playing in one of those categories but still be active in another.

roysti10 commented 3 years ago

I included the retired players because I'm not sure that everyone I mark as retired actually is.

Could you elaborate on this? I didn't quite get you.

Also, you might notice that a player can occur up to three times in the CSV, because there is a separate page for each player for Test, T20, and ODI if they played in them. A player might no longer be playing in one of those categories but still be active in another.

I was actually going to change this to one row per player when I got the time, and delete the gametype column entirely. If he isn't active in the other formats, that shouldn't matter, because his records will simply not be present in the respective format's folder. That shouldn't cause any problems.
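For illustration only, here is a minimal sketch of what collapsing the CSV to one row per player could look like. It assumes the rows carry the same field names as PlayerItem (name, longname, gametype, retired) and that the retired flag is stored as the text "True" or "False"; neither has been checked against the actual file. The helper name collapse_players is hypothetical, and treating a player as retired only when every format says so is a design assumption, not something decided in this thread.

```python
def collapse_players(rows):
    """Collapse per-gametype rows into one row per player ID.

    Assumption: each row is a dict with "name" (the player ID),
    "longname" and "retired" keys, mirroring the PlayerItem fields.
    """
    players = {}
    for row in rows:
        # Assumed encoding: the flag is the text "True" or "False".
        retired = str(row["retired"]).strip().lower() == "true"
        entry = players.setdefault(
            row["name"],
            {"name": row["name"], "longname": row["longname"], "retired": True},
        )
        # A player stays "retired" only if every format marks them retired.
        entry["retired"] = entry["retired"] and retired
    return list(players.values())
```

With this rule, a player who is marked retired on the Test page but still active in T20 would be kept as active, which matches the concern raised above about players being active in only some formats.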

scientes commented 3 years ago

I included the retired players because I'm not sure that everyone I mark as retired actually is.

Could you elaborate on this? I didn't quite get you.

I haven't verified that the data in the retired column is correct, that is, that every player with retired=True is actually retired. I also haven't checked whether the flag is the same for every gametype (meaning that when a player is retired, he is marked as such in every gametype).
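The second check could be sketched roughly as below: group the rows by player ID and report anyone whose retired flag differs between game types. This is only a sketch under assumptions: the column names (name, gametype, retired) are taken from the PlayerItem fields and may not match the CSV header, the flag is assumed to be written as "True" or "False", and both function names are hypothetical.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Read the crawler's CSV into a list of dicts (header row assumed)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def inconsistent_retired_flags(rows):
    """Return {player_id: {gametype: retired}} for players whose
    retired flag is not the same in every game type."""
    flags = defaultdict(dict)
    for row in rows:
        # Assumed encoding: the flag is the text "True" or "False".
        retired = str(row["retired"]).strip().lower() == "true"
        flags[row["name"]][row["gametype"]] = retired
    return {
        player_id: by_type
        for player_id, by_type in flags.items()
        if len(set(by_type.values())) > 1
    }
```

Something like `inconsistent_retired_flags(load_rows("data_crawler/ids_names.csv"))` would then list the players to look at by hand. It cannot tell whether a player with retired=True really is retired; it only shows where the game types disagree.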

roysti10 commented 3 years ago

I included the retired players because I'm not sure that everyone I mark as retired actually is.

Could you elaborate on this? I didn't quite get you.

I haven't verified that the data in the retired column is correct, that is, that every player with retired=True is actually retired. I also haven't checked whether the flag is the same for every gametype (meaning that when a player is retired, he is marked as such in every gametype).

Ah, then fixing this issue could be dangerous. I'll verify this ASAP, and I'll add a wontfix label for now.