VRichardJP / LoLAnalyzer

Using machine learning to find the best pick in a draft (League of Legends)
MIT License
44 stars 5 forks

I have made some modifications that I am not sure are correct #10

Closed godfatherlmh closed 5 years ago

godfatherlmh commented 5 years ago

I have downloaded data for patches 9.7 and 9.6 from all regions, from master to challenger, but the number of "data_#.csv" files in "extracted" is still 11. I don't know whether this is normal, but I have changed the keep_for_testing value to 1 as you suggested. I am still downloading matches from 9.5 to 9.1 to see whether that changes anything. Also, the overall_acc is around 0.5 at the end, and I have no idea whether that is right or wrong since I am new to neural networks.

VRichardJP commented 5 years ago

Thanks, I'll try to check that.

In the sample I made 2 years ago, I used around 750,000 games (which makes about 80~90 files), so I guess you don't have enough data for the network you're using.
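As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes every file in "extracted" holds about the same number of games, which may not be true across versions of the scripts; the function name is made up for illustration.

```python
# Rough estimate only: assumes each data_#.csv holds about the same number of games.
SAMPLE_GAMES = 750_000   # games in the older sample mentioned in the comment above
SAMPLE_FILES = (80, 90)  # approximate number of files that sample produced

def estimate_games(n_files, sample_games=SAMPLE_GAMES, sample_files=SAMPLE_FILES):
    """Return a (low, high) estimate of the games contained in n_files extracted files."""
    fewest_files, most_files = sample_files
    # More files for the same sample means fewer games per file, hence the lower bound.
    low = n_files * sample_games // most_files
    high = n_files * sample_games // fewest_files
    return low, high

low, high = estimate_games(11)
print(f"11 files is roughly {low:,} to {high:,} games")
```

Under that assumption, 11 files is on the order of 100,000 games, far below the 750,000 games of the older sample.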

You can try to reduce the network parameters in the run.py script to get a better result. Basically, big networks can reach a higher accuracy, but they also need more data to make progress.

If you look at Networks.py, I have reported some intermediate results for around 150,000 games: https://github.com/vingtfranc/LoLAnalyzer/blob/a9a2bbcbfa5e8bddcc3fbc02e79383b77ef17694/Networks.py#L24-L34
As you can see, with this amount of data, bigger networks are really inefficient.

So maybe you can try to use 'BR' mode with small parameters. You have no choice but to test multiple things (there is no science that can give you the perfect parameters)

Few info:

VRichardJP commented 5 years ago

Thank you for the changes, I fixed some things on my side, see a6c54afa6dcc70fff816822c978deb74f5063ff9

Unrelated to what I said, I think the network is not learning because your configuration is incorrect. In your config.ini file, I think the champion list is displayed like this:

[CHAMPIONS]
aatrox = Aatrox
ahri = Ahri
...

But this part should associate champions with an ID like this:

[CHAMPIONS]
aatrox = 266
ahri = 103
...

In Riot's API only the IDs are used. At the data extraction phase I convert this ID back to the champion name. In your case, the ID<->champion mapping is missing, so I guess your "extracted" games are just filled with 0.

To fix this, simply run the ConfigUpdater.py script once again (you can keep the old config.ini to copy/paste your changes). Then delete the extracted directory and extracted.txt, and start again from the extraction step (also delete the testing/training/data of ABR_TJMCS just to be sure).
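Before re-running the extraction, the [CHAMPIONS] section can be checked quickly. The sketch below is not part of the repository: the helper name is made up, and it only assumes that config.ini is a standard INI file where every [CHAMPIONS] value should be a numeric ID, as in the example above.

```python
import configparser

def non_numeric_champions(config_text):
    """Return the [CHAMPIONS] entries whose value is not an integer ID."""
    # interpolation=None so that a stray '%' in a value cannot raise an error
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section("CHAMPIONS"):
        return {}
    return {name: value
            for name, value in parser.items("CHAMPIONS")
            if not value.strip().isdigit()}

# One entry in the wrong format (name) and one in the expected format (ID)
sample = """
[CHAMPIONS]
aatrox = Aatrox
ahri = 103
"""
print(non_numeric_champions(sample))
```

To check the real file, pass in its contents, for instance `non_numeric_champions(open("config.ini").read())`. An empty result means every champion is mapped to a numeric ID.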

godfatherlmh commented 5 years ago

Yeah, I kind of hard-coded that part: I looked up the ID of the new champion and combined it with the [CHAMPIONS] section of the old config.ini for convenience. Also, Nunu's name has changed as well, but I left those changes out when I created the pull request. I am also curious whether there is a way to speed up downloading the data for each patch, since I want to see whether the analysis is independent of player level. Right now, processing the diamond division on the NA server alone takes 40x longer than grandmaster/master/challenger.

VRichardJP commented 5 years ago

The download process is limited by the request rate of your API key. I guess you have a developer key, so you cannot do more than 0.83 requests per second. As my application is registered on Riot's developer portal, I have an API key that can do 50 requests per second. Maybe you should try to get one. Note that the output log does not print all the requests, so when you see that a game has been downloaded, it may have required multiple requests (depending on how the game has been found). Btw, each server is independent, so you will download games faster if you load from all servers.
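To give an idea of what these rates mean in practice, here is a minimal sketch, not taken from DataDownloader.py. The `Throttle` class and `hours_needed` function are hypothetical, and spacing requests evenly is a simplifying assumption, since the real limits are counted over time windows.

```python
import time

class Throttle:
    """Space out calls so that at most `rate` calls start per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time at which the next call may start

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval

def hours_needed(n_requests, rate):
    """Lower bound, in hours, to send n_requests at `rate` requests per second."""
    return n_requests / rate / 3600

# Hypothetical workload of 1,000,000 requests, at the two rates quoted above
for rate in (0.83, 50):
    print(f"{rate} req/s: at least {hours_needed(1_000_000, rate):.1f} hours")
```

A downloader would call `wait()` on a `Throttle` before each request, with one instance per server since each server has its own limit.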

As for the slow download speed: have you run PlayerListing.py after adding diamond players to your config.ini? The DataDownloader.py script simply goes through the list of players the previous script has created. If there are almost no diamond players in it, it's not surprising that it won't find many games. Of course, since there are way more diamond players than master/challenger players, it takes a lot of time to list all the players or find all the games (I think I never tested below platinum, it would have taken ages).