jjakubassa / WDI-Project


Spotify_issue #5

Closed jjakubassa closed 8 months ago

jjakubassa commented 8 months ago

As can be seen in IR/data/output/debugResultsMatchingRule.csv_short, there is an issue with how the data is read (see image below). My guess is that we only get those strange Spotify titles because they are the only ones with quotation marks. Since we block on the first few characters and wdc always has quotation marks, only the strange cases end up in the same block. Proposed solution: strip quotation marks from every attribute.

image
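
A minimal sketch of the proposed fix, in Python for illustration only (the project's actual language and data model are not shown in this issue). The helper names `strip_quotes`, `clean_record`, and `blocking_key`, the prefix length of 3, and the sample records are all hypothetical. It shows why a leading quotation mark changes a prefix-based blocking key, and how stripping quotes from every attribute puts both sources in the same block:

```python
def strip_quotes(value):
    """Remove surrounding whitespace and quotation marks from one attribute value."""
    if not isinstance(value, str):
        return value  # leave missing or non-string values untouched
    return value.strip().strip("\"'").strip()


def clean_record(record):
    """Apply strip_quotes to every attribute of a record (a plain dict here)."""
    return {key: strip_quotes(val) for key, val in record.items()}


def blocking_key(title, length=3):
    """Hypothetical blocking key: the first few characters, lower-cased."""
    return title[:length].lower()


# Made-up sample records: the wdc values carry quotation marks, the Spotify ones do not.
spotify_raw = {"title": "Hello", "artist": "Adele"}
wdc_raw = {"title": '"Hello"', "artist": '"Adele"'}

# Without cleaning, the leading quote becomes part of the key and the blocks differ.
raw_keys_match = blocking_key(spotify_raw["title"]) == blocking_key(wdc_raw["title"])

# After cleaning every attribute, both records fall into the same block.
spotify = clean_record(spotify_raw)
wdc = clean_record(wdc_raw)
clean_keys_match = blocking_key(spotify["title"]) == blocking_key(wdc["title"])
```

Cleaning at read time, before the blocking key is computed, keeps the blocker and the matching rule working on the same normalized values.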