Blackpassat / european_football

0 stars 1 forks source link

Data model recommendations/questions #1

Open arwhyte opened 5 years ago

arwhyte commented 5 years ago

Relationships

I could be wrong but I would think the following relationships hold:

M2M

M2M

I can be in the Premier League one year and then get relegated the next. I can also be a member of both the Premier League and the Champions League.

1:M

The match table should not have to hold any info regarding individual players. You retrieve the players via the teams that play in the match. Likewise, you retrieve the league(s) in the match via the teams. Then again, perhaps all matches are tied to a league in some manner (how are friendlies handled?) so perhaps including a league_id FK in the match is logical.

Properties

Blackpassat commented 5 years ago

I still have some problems regarding my data model. I have uploaded my database called "soccer.zip" to this repo. This is the original database I downloaded from kaggle.

league & team

I think that the relationship should be 0 or 1:M. Because there is no league that containing teams from multiple leagues like Champions League and friendlies in my database. There is only one league for each country. If a team got relegated, it simply disappeared in the next season.

match & players

I think the 'match' table should hold the squad list info. Because the squad list is always changing for different matches, even the matches in the same season. It would be difficult to retrieve the players info via the teams. I think the players is more related to matches rather than teams. Do I need to create a junction table for match and players?

match & league

I think the league info can be retrieved via the teams for different matches. Since a team only belongs to one league for my database. I don't quite understand why including a 'league_id' FK in the 'match' is logical. Another thing is that not all matches are tied to a league in my database. The 'match' table contains the match info for all the 11 leagues from season 2008/2009 to 2015/2016.

relationships for 'match'

I don't understand why the relationship between 'team' and 'match' is 1:M. I think it should be M:M because a match should have two teams, home and away, and a team can play in multiple matches. If the relationship for 'team' and 'match' is 1:M, would 'player' and 'match' also be 1:M?

other problems

arwhyte commented 5 years ago

league & team I think that the relationship should be 0 or 1:M. Because there is no league that containing teams from multiple leagues like Champions League and friendlies in my database. There is only one league for each country. If a team got relegated, it simply disappeared in the next season.

If that is a hard and fast rule (1 country, 1 league, n teams in country) than you can place league_id in the team table as a FK.

match & players I think the 'match' table should hold the squad list info. Because the squad list is always changing for different matches, even the matches in the same season. It would be difficult to retrieve the players info via the teams. I think the players is more related to matches rather than teams. Do I need to create a junction table for match and players?

Players are related to both teams and matches. Sounds like you need

match - match_squad - player

match_squad match_player_id (Django friendly PK) match_id team_id is_home_team TINYINT(1) values (0, 1) player_id . . .

relationships for 'match' I don't understand why the relationship between 'team' and 'match' is 1:M. I think it should be M:M because a match should have two teams, home and away, and a team can play in multiple matches. If the relationship for 'team' and 'match' is 1:M, would 'player' and 'match' also be 1:M?

Think of it this way. A team can play in many matches in a given season. But for each match in a given season you can only assign a single home team and a single away team.

Your data model screenshot defines the correct relationship between match and team.

other problems There is repetitive info in my tables like the category info for heritage sites. For example, seasons. The season can only be one from 2008/2009 to 2015/2016. If I didn't create an individual table to hold this info, would my score be affected?

You should have a season table.

Blackpassat commented 5 years ago

For the match-match_squad_player part, do you mean I should create a 'match_squad' table as a junction table? I understand the match_id, team_id and is_home_team fields, but I am a little confused about the match_player_id and player_id. Does the match_player_id mean the players played in that match, which should be 11 players? What about the player_id? I do not have information about players in a specific team. I only have information about players played in a specific match.

arwhyte commented 5 years ago

It's a typo. You spoke of squads so I suggested that you rename the table to match_squad. I neglected to rename match_player_id as match_squad_id. It's the Django friendly PK for match_squad.

Blackpassat commented 5 years ago

So there should be one match and one player in one row in the 'match_squad' table. Not like the original 'match' table where all the players for one match are placed in one row, right?

arwhyte commented 5 years ago

Correct.