Questions for roster_design_part_3_player_allocation

Answers to the questions I posted (roster_design_part_3_player_allocation):

1) For players that play in multiple positions:

If the visitor player in position 1 played any games in any other position throughout the season, the total number of zone starts he was on the ice for is calculated with np.where. It is the sum of his zone starts across all positions he played in. A player might appear in position 6 (the goaltender position) when his team is trailing in the final minutes of a game and pulls the goaltender to add an additional skater.

tzvp1 = total zone starts for visitor player 1; zvp1 = zone start player 1; zvp2 = zone start player 2

If visitor player 1 also played in position 2, the total zone starts for that player (tzvp1) will be the sum of his zone starts in positions 1 and 2. I apply this for all 6 positions for both home and away teams.

Code:

    dm['tzvp1'] = np.where((dm['VTeamCode'] == dm['VTeamCode']) & (dm['VPlayer1'] == dm['VPlayer2']), dm['zvp1'] + dm['zvp2'],
                  np.where((dm['VTeamCode'] == dm['VTeamCode']) & (dm['VPlayer1'] == dm['VPlayer3']), dm['zvp1'] + dm['zvp3'],
                  np.where((dm['VTeamCode'] == dm['VTeamCode']) & (dm['VPlayer1'] == dm['VPlayer4']), dm['zvp1'] + dm['zvp4'],
                  np.where((dm['VTeamCode'] == dm['VTeamCode']) & (dm['VPlayer1'] == dm['VPlayer5']), dm['zvp1'] + dm['zvp5'],
                  np.where((dm['VTeamCode'] == dm['VTeamCode']) & (dm['VPlayer1'] == dm['VPlayer6']), dm['zvp1'] + dm['zvp6'],
                           dm['zvp1'])))))
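One possible alternative to the nested np.where is to reshape the player and zone-start columns to long format and sum per player. This avoids relying on two position columns naming the same player on the same row. The sketch below is only illustrative: the toy values, the helper names long and totals, and the assumption that each zvp column holds a per-row count are not taken from the original data.

```python
import pandas as pd

# Toy data (hypothetical): one row per event, visitor side only, two positions shown.
dm = pd.DataFrame({
    "Season": [2015, 2015, 2015],
    "VTeamCode": ["BOS", "BOS", "BOS"],
    "VPlayer1": ["A", "A", "B"],
    "VPlayer2": ["B", "C", "A"],
    "zvp1": [3, 2, 4],
    "zvp2": [1, 5, 2],
})

positions = [1, 2]  # use range(1, 7) for all six positions

# Stack the per-position columns into long format: one row per (event, position).
long = pd.concat(
    [
        dm[["Season", "VTeamCode", f"VPlayer{p}", f"zvp{p}"]]
        .rename(columns={f"VPlayer{p}": "player", f"zvp{p}": "zone_starts"})
        .assign(position=p)
        for p in positions
    ],
    ignore_index=True,
)

# Total zone starts per player, regardless of the position he occupied.
totals = (
    long.groupby(["Season", "VTeamCode", "player"])["zone_starts"]
    .sum()
    .rename("tzvp1")
    .reset_index()
)

# Attach each position-1 player's season total back onto the event rows.
dm = dm.merge(
    totals.rename(columns={"player": "VPlayer1"}),
    on=["Season", "VTeamCode", "VPlayer1"],
    how="left",
)
```

The same totals table can be merged onto VPlayer2 through VPlayer6 (and the home columns) by changing the renamed key, so the reshaping only has to be done once per side.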

Overall zone starts:

So far, the zone starts of each player have been calculated separately for his team's home games and away games, since separate home and visitor zone start values were used. A player's total zone starts is the number of zone starts he was on the ice for over the whole season, that is, the sum of his home and away zone starts. In the code below, this total is divided by games played.

    dm['zplyr1'] = np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] == dm['VTeamCode']) & (dm['HPlayer1'] == dm['VPlayer1']), (dm['tzhp1'] + dm['tzvp1'])/dm['gp1'],
                   np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] != dm['VTeamCode']) & (dm['HPlayer1'] != dm['VPlayer1']), dm['tzhp1']/dm['thgp1'],
                   np.where((dm['Season'] == dm['Season']) & (dm['VTeamCode'] == dm['HTeamCode']) & (dm['VPlayer1'] == dm['HPlayer1']), (dm['tzvp1'] + dm['tzhp1'])/dm['gp1'],
                            dm['tzvp1']/dm['tvgp1'])))

I apply this code for all 6 roster positions.
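If each row of dm is a single game event, the home and visitor codes on that row presumably belong to different teams, so a row-wise test such as HTeamCode == VTeamCode may rarely or never hold. A hedged alternative is to build one per-player table for home games and one for away games and then join them. The sketch below assumes such tables already exist; the table names home, away and season, the un-suffixed column names, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical per-player season totals, already summed over positions.
home = pd.DataFrame(
    {"player": ["A", "B"], "tzhp": [40, 30], "thgp": [10, 8]}
).set_index("player")
away = pd.DataFrame(
    {"player": ["A", "C"], "tzvp": [20, 12], "tvgp": [5, 4]}
).set_index("player")

# The outer join keeps players who appeared only at home or only away;
# the missing side is treated as zero zone starts and zero games.
season = home.join(away, how="outer").fillna(0)

season["zone_starts"] = season["tzhp"] + season["tzvp"]
season["gp"] = season["thgp"] + season["tvgp"]

# Zone starts per game played over the whole season.
season["zplyr"] = season["zone_starts"] / season["gp"]
```

Joining on the player index handles the three cases (home only, away only, both) without separate branches.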

2) Allocating players per position to forward lines and defensive pairings:

Generate a variable that allocates each player to his respective line. Position 1 is the centre position of the forward lines. The player with the highest total zone starts among the players at a position is assigned to the top line, and the player with the lowest total is assigned to the 4th line. Of the two remaining players, the one with the higher total zone starts is allocated to the 2nd line and the other to the 3rd line.

First, I generate a column that contains the max value for visitor player 1:

dm['vmax1'] = dm.groupby(['Season', 'VTeamCode'])['tzvp1'].transform('max')

Second, I generate a column that contains the min value for visitor player 1:

dm['vmin1'] = dm.groupby(['Season', 'VTeamCode'])['tzvp1'].transform('min')

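Third, if a player's total zone starts equal neither the max nor the min value for position 1, I compare them with the other value that is neither min nor max (taken from the previous row with shift()). The larger of the two is assigned to line 2 and the other to line 3.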

    dm['vc'] = np.where((dm['Season'] == dm['Season']) & (dm['VTeamCode'] == dm['VTeamCode']) & (dm['tzvp1'] == dm['vmax1']), 1,
               np.where((dm['Season'] == dm['Season']) & (dm['VTeamCode'] == dm['VTeamCode']) & (dm['tzvp1'] == dm['vmin1']), 4,
               np.where((dm['Season'] == dm['Season']) & (dm['VTeamCode'] == dm['VTeamCode']) & (dm['VPlayer1'] != dm['VPlayer1'])
                        & (dm['tzvp1'] != dm['vmax1']) & (dm['tzvp1'] != dm['vmin1'])
                        & (dm['tzvp1'].shift() != dm['vmax1']) & (dm['tzvp1'].shift() != dm['vmin1'])
                        & (dm['tzvp1'] > dm['tzvp1'].shift()), 2, 3)))

I repeat this code for all 6 positions for both home and away teams (12 variables in total).
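A comparison of a column with itself, such as dm['VPlayer1'] != dm['VPlayer1'], is False for every non-missing row, so the third branch of the nested np.where cannot return 2. One way to express the intended rule (highest total on line 1, lowest on line 4) is a descending rank within each season and team. This is a minimal sketch on a hypothetical table with one row per centre; the table name centres and all values are invented for illustration.

```python
import pandas as pd

# Hypothetical: one row per visitor centre with his season total zone starts.
# From event-level data this could come from something like
# dm[["Season", "VTeamCode", "VPlayer1", "tzvp1"]].drop_duplicates().
centres = pd.DataFrame({
    "Season": [2015] * 8,
    "VTeamCode": ["BOS"] * 4 + ["NYR"] * 4,
    "VPlayer1": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "tzvp1": [120, 300, 210, 90, 50, 75, 60, 95],
})

# Rank players within each season and team, highest total first.
# method="first" breaks ties by row order so every player gets a distinct line.
centres["vc"] = (
    centres.groupby(["Season", "VTeamCode"])["tzvp1"]
    .rank(method="first", ascending=False)
    .astype(int)
)
```

If a team used more than four players at a position, the ranks would run past 4; capping them with .clip(upper=4) is one option, though how to treat those extra players is a modelling decision.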

Games played:

Create a variable that counts the number of games played by each visitor-team player in position 1.

dm['vgp1'] = dm.groupby(['Season', 'VTeamCode', 'EventNumber', 'VPlayer1'])['GameNumber'].transform('count')

Create a variable that counts the number of games played by each home-team player in position 1.

dm['hgp1'] = dm.groupby(['Season', 'HTeamCode', 'EventNumber', 'HPlayer1'])['GameNumber'].transform('count')
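Because transform('count') counts rows, the result equals games played only if each group contains exactly one row per game. If dm holds several event rows per game, counting distinct game numbers may be closer to the intent. The sketch below assumes GameNumber identifies a game within a season; the toy rows and the comparison column vrows1 are hypothetical.

```python
import pandas as pd

# Toy event-level data (hypothetical): several events per game.
dm = pd.DataFrame({
    "Season": [2015] * 5,
    "VTeamCode": ["BOS"] * 5,
    "GameNumber": [1, 1, 1, 2, 2],
    "EventNumber": [1, 2, 3, 1, 2],
    "VPlayer1": ["A", "A", "B", "A", "A"],
})

keys = ["Season", "VTeamCode", "VPlayer1"]

# Distinct games in which the player appeared in position 1.
dm["vgp1"] = dm.groupby(keys)["GameNumber"].transform("nunique")

# For comparison: the number of event rows, which overstates games played.
dm["vrows1"] = dm.groupby(keys)["GameNumber"].transform("count")
```

Here player A appears in four event rows spread over two games, so vgp1 is 2 while the row count is 4.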

Overall player allocation:

Each player has been assigned to his respective roster position separately for his team's home and away games over the season. A player's overall roster position is the mean of his home and away positions.

c = centre position; vc = visitor centre; hc = home centre; gp1 = games played for visitor player 1; thgp1 = total home games played; tvgp1 = total visitor games played

    dm['c'] = np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] == dm['VTeamCode']) & (dm['HPlayer1'] == dm['VPlayer1']), (dm['hc'] + dm['vc'])/dm['gp1'],
              np.where((dm['Season'] == dm['Season']) & (dm['HTeamCode'] != dm['VTeamCode']) & (dm['HPlayer1'] != dm['VPlayer1']), dm['hc']/dm['thgp1'],
              np.where((dm['Season'] == dm['Season']) & (dm['VTeamCode'] == dm['HTeamCode']) & (dm['VPlayer1'] == dm['HPlayer1']), (dm['vc'] + dm['hc'])/dm['gp1'],
                       dm['vc']/dm['tvgp1'])))

I apply this for all 6 positions: c, rw, lw, dr, dl, g.
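The text describes the overall position as the mean of the home and away positions. One way to make that concrete, offered as an assumption rather than as the method used here, is a games-weighted mean of the two line numbers, so that a player with few away games is not pulled too far by them. The table alloc and its values are hypothetical.

```python
import pandas as pd

# Hypothetical per-player table: line at home (hc), line away (vc), games on each side.
alloc = pd.DataFrame({
    "player": ["A", "B", "C"],
    "hc": [1.0, 2.0, None],  # C never played at home
    "vc": [2.0, 2.0, 3.0],
    "thgp1": [30, 20, 0],
    "tvgp1": [10, 20, 12],
}).set_index("player")

# Weight each side's line number by the games played on that side.
# A missing line is paired with zero games, so it contributes nothing.
weighted_sum = (
    alloc["hc"].fillna(0) * alloc["thgp1"]
    + alloc["vc"].fillna(0) * alloc["tvgp1"]
)
alloc["c"] = weighted_sum / (alloc["thgp1"] + alloc["tvgp1"])
```

The result is generally not an integer (player A gets 1.25), so a rounding or re-ranking step would still be needed before assigning a single line per player.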

I have stored the dm data frame as a CSV file: "dm.to_csv('player_allocation.csv', index=False, sep=',')"

What is the next step?

a) Should I run the code on the whole-season data frame and then try to apply the roster model?

b) Should I try to run the roster model for the 2 games only? The 2nd game had only one goal, so there are 2 allocated players per position.

stephtselios / nhl_roster_design

Questions for roster_design_part_3_player_allocation #1
