After adding some indexes, MBZDB is much more efficient. I dropped all the tables that contain no records. I also wrote some queries to gain insight into the data.
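
The clean-up described above (adding an index and dropping tables that hold no records) could be sketched roughly as follows. This is only an illustration: it uses an in-memory SQLite database as a stand-in, since the engine behind MBZDB is not stated here, and the helper `drop_empty_tables`, the table `empty_table`, and the index name are hypothetical.

```python
import sqlite3


def drop_empty_tables(conn):
    """Drop every user table that contains no rows; return the dropped names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    dropped = []
    for table in tables:
        # Names come from the catalog itself, so quoting them is sufficient.
        rows = conn.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchall()
        if not rows:
            conn.execute(f'DROP TABLE "{table}"')
            dropped.append(table)
    return sorted(dropped)


# Demo data (assumption: made up, not the real MBZDB contents).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE empty_table (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO artist (name) VALUES ('Example Artist')")

# An index on a column that is filtered or joined on often.
conn.execute("CREATE INDEX idx_artist_name ON artist (name)")

dropped_tables = drop_empty_tables(conn)
remaining_tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
print(dropped_tables, remaining_tables)
```

On a real server the catalog query would differ (for example `information_schema.tables` instead of `sqlite_master`), so treat the catalog lookup as the part to adapt.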

4. Validate some distributions

A NULL year, or a year of 2017/2019, must be wrong; delete these rows in further analysis.

SELECT COUNT(re.name), rc.date_year FROM release AS re, release_country AS rc WHERE rc.release = re.id GROUP BY (rc.date_year);
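
One way to apply that rule in later analysis is to filter the year column before grouping. The sketch below runs a filtered variant of the query above against an in-memory SQLite database. The two tables are reduced to the columns the query uses, the rows are made up, and treating every year from 2017 onward as implausible is an assumption based on the note above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE release (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE release_country (release INTEGER, date_year INTEGER);
    INSERT INTO release (id, name)
    VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO release_country (release, date_year)
    VALUES (1, 1999), (2, 1999), (3, NULL), (4, 2019);
    """
)

# Count releases per year, skipping missing and implausible years.
rows = conn.execute(
    """
    SELECT rc.date_year, COUNT(re.name)
    FROM release AS re
    JOIN release_country AS rc ON rc.release = re.id
    WHERE rc.date_year IS NOT NULL   -- drop missing years
      AND rc.date_year < 2017        -- drop implausible years (assumed cutoff)
    GROUP BY rc.date_year
    ORDER BY rc.date_year
    """
).fetchall()
print(rows)
```

Filtering in the `WHERE` clause keeps the bad rows in the database, so the raw counts can still be inspected with the unfiltered query.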

Filter out unreasonable dates

SELECT COUNT(name), begin_date_year, end_date_year FROM artist WHERE begin_date_year > 1950 AND begin_date_year < 2017 AND (end_date_year > 1990 OR end_date_year IS NULL) GROUP BY(begin_date_year);

Various types of place

SELECT COUNT(*), place_type.name FROM place, place_type WHERE place.type = place_type.id GROUP BY(place.type);

The US has 30 areas named Washington County...

SELECT COUNT(gid), name FROM area GROUP BY(name) ORDER BY(COUNT(gid)) DESC;
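
Since a name alone does not identify an area, it may help to list only the names that occur more than once. The following is a minimal sketch, again using in-memory SQLite with made-up rows; only the `gid` and `name` columns from the query above are modelled, and the `HAVING` filter is an addition, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE area (gid TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO area (gid, name) VALUES (?, ?)",
    [
        ("g1", "Washington County"),
        ("g2", "Washington County"),
        ("g3", "Springfield"),
    ],
)

# Keep only names shared by more than one area, most frequent first.
duplicates = conn.execute(
    """
    SELECT name, COUNT(gid) AS n
    FROM area
    GROUP BY name
    HAVING COUNT(gid) > 1
    ORDER BY n DESC
    """
).fetchall()
print(duplicates)
```

Any later join on area names would need `gid` (or `id`) rather than `name` to avoid merging distinct areas.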

SELECT * FROM place;

SELECT * FROM place_type;

tailaijin / Data_Mining-Final_Project_Music

@ Tailai Jin Cut database into reasonable size. Track table has more than 0.2 Billion records. #2
