track_id/song_id handling

Looking at the unique tracks file (1,000,000 entries), we can see 1,000,000 unique track_ids but only 999,056 unique song_ids. That is 944 more track_ids than song_ids, so some song_ids map to more than one track_id. This is consistent with Jose's point. However, it also means that track_ids are underreported in this dataset.

I'd suggest that we remove the extra track_ids that correspond to the same song, keeping one track_id per song_id. That removes 944 of 1,000,000 entries (about 0.094%, so less than 0.1% of our data) and clears up all confusion on this issue.
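A minimal sketch of that cleanup, in case it helps the discussion. It assumes each line of the unique tracks file is `<SEP>`-separated with track_id in the first field and song_id in the second. The helper names and the sample IDs below are made up for illustration. The `keep_first` flag covers both readings of the proposal: keep one track per song (the default), or drop every track whose song has more than one.

```python
from collections import Counter

SEP = "<SEP>"  # assumed field separator in the unique tracks file


def parse_tracks(lines):
    """Yield (track_id, song_id) pairs from lines of the unique tracks file."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(SEP)
        yield fields[0], fields[1]


def drop_duplicate_songs(pairs, keep_first=True):
    """Return the pairs left after removing tracks that share a song_id.

    keep_first=True  keeps the first track seen for each song_id.
    keep_first=False drops every track whose song_id occurs more than once.
    """
    pairs = list(pairs)
    counts = Counter(song_id for _, song_id in pairs)
    seen = set()
    kept = []
    for track_id, song_id in pairs:
        if counts[song_id] > 1 and (not keep_first or song_id in seen):
            continue  # duplicate track for this song
        seen.add(song_id)
        kept.append((track_id, song_id))
    return kept


# Tiny made-up sample; with the real file, pass an open file handle instead.
sample = [
    "TR0001<SEP>SO0001<SEP>Artist A<SEP>Title A",
    "TR0002<SEP>SO0002<SEP>Artist B<SEP>Title B",
    "TR0003<SEP>SO0001<SEP>Artist A<SEP>Title A (live)",
]
pairs = list(parse_tracks(sample))
kept = drop_duplicate_songs(pairs)
print(len(pairs), "tracks,", len({s for _, s in pairs}), "songs,", len(kept), "kept")
```

On the real file, the three printed numbers should match the counts above if the format assumption holds: 1,000,000 tracks, 999,056 songs, and 999,056 kept with the default setting.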