NovoNordisk-OpenSource / decentralized-tech-radar

Decentralized Tech Radar - ITU ISE 2024 Collaboration
GNU Affero General Public License v3.0

[PR4:1]feat: CSV content verifier (data integrity & duplicate removal) #91

Closed Slug-Boi closed 6 months ago

Slug-Boi commented 6 months ago

This PR contains a basic duplicate remover that compares the names of blips and their quadrants, and removes lines from the CSV file when both match. It also includes a hard-coded map of alternative names that normalizes variations of a name so they are caught as duplicates (e.g. csharp, C#, CSHARP).
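
For illustration, the duplicate removal described above could look roughly like the sketch below. It is not the code from this PR: the function names, the contents of `ALT_NAMES`, and the `name`/`quadrant` column names are assumptions made for the example.

```python
import csv
import io

# Hypothetical alias map (illustrative only): variant spelling -> canonical name.
ALT_NAMES = {"c#": "csharp", "c sharp": "csharp"}


def canonical_name(name):
    """Lower-case and trim a blip name, then map known variants to one spelling."""
    key = name.strip().lower()
    return ALT_NAMES.get(key, key)


def remove_duplicates(csv_text):
    """Keep the first row for each (canonical name, quadrant) pair.

    Assumes the CSV has a header row with 'name' and 'quadrant' columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    kept = []
    for row in reader:
        key = (canonical_name(row["name"]), row["quadrant"].strip().lower())
        if key in seen:
            continue  # same blip in the same quadrant: drop the later line
        seen.add(key)
        kept.append(row)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames or [], lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return out.getvalue()
```

In this sketch the first occurrence wins and later matches are dropped; a blip with the same name in a different quadrant is kept, since the key includes the quadrant.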

It also includes a data integrity verifier that checks whether the CSV file header and data are correctly formatted.

The verifier runs when fetching and when merging.
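
As a rough sketch of what such a check might involve (again, not the PR's implementation), the example below validates the header and the field count of every row. The expected header and the `verify_csv` name are assumptions; the real verifier may enforce different or stricter rules.

```python
import csv
import io

# Assumed header layout for the radar CSV; adjust to the project's actual format.
EXPECTED_HEADER = ["name", "ring", "quadrant", "isNew", "moved", "description"]


def verify_csv(csv_text):
    """Return a list of human-readable problems; an empty list means the file passed."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    errors = []
    if rows[0] != EXPECTED_HEADER:
        errors.append(f"row 1: unexpected header {rows[0]}")

    # Data rows start at row 2 (row 1 is the header).
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(EXPECTED_HEADER):
            errors.append(
                f"row {row_no}: expected {len(EXPECTED_HEADER)} fields, got {len(row)}"
            )
        elif not row[0].strip():
            errors.append(f"row {row_no}: empty name")
    return errors
```

Returning a list of problems rather than raising on the first one lets a caller report every malformed row in a single pass.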

Resolves #60 Resolves #75 Resolves #67