RUC-MSc-CS-CIT-2024 / portfolio_subproject_1

Portfolio Subproject made for the CIT 2024 course

feat: import additional data #17

Closed MarLaslo closed 1 month ago

MarLaslo commented 1 month ago


Don't mind the DO, it should not be there.

The update works: all the data is updated, and all empty fields and N/A values are converted to null.

Removed $ and commas from boxoffice so it is an integer now.
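
A minimal sketch of the two cleaning steps described above, written in Python purely for illustration. The actual import was presumably done in the database, and the function names `clean_field` and `parse_boxoffice` are hypothetical.

```python
from typing import Optional


def clean_field(value: Optional[str]) -> Optional[str]:
    """Map empty strings and 'N/A' to None; otherwise return the trimmed text."""
    if value is None:
        return None
    text = value.strip()
    if text == "" or text.upper() == "N/A":
        return None
    return text


def parse_boxoffice(value: Optional[str]) -> Optional[int]:
    """Strip '$' and thousands separators so the box office figure is an integer."""
    cleaned = clean_field(value)
    if cleaned is None:
        return None
    return int(cleaned.replace("$", "").replace(",", ""))


if __name__ == "__main__":
    # Illustrative values only, not taken from the real data set.
    print(parse_boxoffice("$1,234,567"))
    print(parse_boxoffice("N/A"))
```

Running the null conversion before the integer parse means a missing box office value ends up as null rather than raising a conversion error.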

I have added a poster attribute to the media table, as I saw in the source data paper that it is required.

MarLaslo commented 1 month ago

Will fix the stuff!

MarLaslo commented 1 month ago
* `language` - Is related to a release (there should be a table where )

Language in OMDb is not a code but full words. I think we should decide whether we want to store the language as a code or in words. But I think it is a waste of time to try to match the codes with the languages, if that is what you are pointing at.

I will skip this until we discuss.

MarLaslo commented 1 month ago

[DONE]

* `poster` - should be a promotional media of type poster (you can just set it for the main release)
* `rated` - the age rating of the movie (should be on the release of the movie, can see we forgot to add the column)
* `released` - the imdb data only has release year, update the release matching the year with the right date or add new release if year doesn't match or there are multiple in the same year
* `production` - this looks like it's the production company for a media

[TO BE DONE]

* `imdbrating` - Should be added as a new score with source set to 'IMDb'
* `ratings` - create scores based on the values in the ratings column (looks like it already has a source and a value)
* `imdbvotes` - we should add an additional column to score for vote_count
* `metascore` - like with imdbrating (source = 'metacritic')

[IGNORED]

* `language` - Is related to a release (there should be a table where )

MarLaslo commented 1 month ago

/* IDEA: What if we create a table and store the types in there?

related_media_category(categoryname, parentcategory)

We could then have images, websites, and videos without a parent category, and subcategories such as poster, actor, director, and premiere for images, etc. */
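
The idea above could be sketched as a self-referencing table. This is only an illustration using Python's built-in `sqlite3` with an in-memory database as a stand-in for the project's database. The table and column names come from the comment; the singular category values and the `subcategories` helper are assumptions.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# A category with no parent is a top-level type; a subcategory points at its parent.
conn.execute(
    """
    CREATE TABLE related_media_category (
        categoryname   TEXT NOT NULL PRIMARY KEY,
        parentcategory TEXT REFERENCES related_media_category(categoryname)
    )
    """
)

# Parents are inserted before their children so the foreign key is satisfied.
conn.executemany(
    "INSERT INTO related_media_category (categoryname, parentcategory) VALUES (?, ?)",
    [
        ("image", None),
        ("website", None),
        ("video", None),
        ("poster", "image"),
        ("actor", "image"),
        ("director", "image"),
        ("premiere", "image"),
    ],
)


def subcategories(connection: sqlite3.Connection, parent: str) -> list:
    """Return the names of all categories directly under the given parent."""
    cursor = connection.execute(
        "SELECT categoryname FROM related_media_category "
        "WHERE parentcategory = ? ORDER BY categoryname",
        (parent,),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    print(subcategories(conn, "image"))
```

With this layout, adding a new type later is a single insert rather than a schema change.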

MarLaslo commented 1 month ago

and for the promotional media, should we then make this a task for later?

If we agree on it tomorrow, then yes, but I think this can be left for a later part of the portfolio. This change would not require a lot of time: just creating a new table and adding the types and parents. Then, if we populate the db, we will need to tweak the type value in the import, but only if we decide on names other than website and poster.

Also, should we then remove the website from the media? I assume it is promotional media.

MarLaslo commented 1 month ago

The score table is now updated with data from OMDB_DB.

I did not use metascore and imdbrating, as I used the ratings column, which has the same values plus Rotten Tomatoes.
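
A rough sketch of how score rows might be derived from the ratings column. It assumes the column holds a JSON array of objects with `Source` and `Value` keys, as the OMDb API returns them; how OMDB_DB actually stores the column is not shown in this thread, and the function name `scores_from_ratings` is hypothetical.

```python
import json
from typing import List, Optional, Tuple


def scores_from_ratings(ratings_json: Optional[str]) -> List[Tuple[str, str]]:
    """Turn a JSON ratings array into (source, value) pairs, one per score row.

    Assumption: each entry looks like {"Source": "...", "Value": "..."}.
    Missing, empty, or 'N/A' input yields no rows.
    """
    if ratings_json is None or ratings_json.strip() in ("", "N/A"):
        return []
    entries = json.loads(ratings_json)
    return [(entry["Source"], entry["Value"]) for entry in entries]


if __name__ == "__main__":
    # Made-up sample in the assumed format, not real data.
    sample = (
        '[{"Source": "Internet Movie Database", "Value": "8.0/10"},'
        ' {"Source": "Rotten Tomatoes", "Value": "90%"},'
        ' {"Source": "Metacritic", "Value": "75/100"}]'
    )
    for source, value in scores_from_ratings(sample):
        print(source, value)
```

Because each entry already carries its own source, one pass over this column can cover the IMDb, Metacritic, and Rotten Tomatoes scores without reading the separate imdbrating and metascore columns.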

idax6797 commented 1 month ago

and for the promotional media, should we then make this a task for later?

If we agree on it tomorrow, then yes, but I think this can be left for a later part of the portfolio. This change would not require a lot of time: just creating a new table and adding the types and parents. Then, if we populate the db, we will need to tweak the type value in the import, but only if we decide on names other than website and poster.

Also, should we then remove the website from the media? I assume it is promotional media.

Yes, I agree. If we decide to go with this, I think the website should be removed from the media, as it makes more sense to categorize it under promotional media.