CAIDA / catalog-data

Repo which holds some panda solutions and papers

Duplicate objects in catalog-data and catalog-data-caida #585

Open aaronchan32 opened 1 year ago

aaronchan32 commented 1 year ago

Remove objects from catalog-data that are duplicated in catalog-data-caida, keeping only the copy in catalog-data-caida.

Example:
catalog-data: https://github.com/CAIDA/catalog-data/blob/master/sources/software/ioda.json
catalog-data-caida: https://github.com/CAIDA/catalog-data-caida/blob/master/sources/software/ioda.json
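The example above has the same relative path (sources/software/ioda.json) in both repositories, so one quick first pass is to list every JSON file that exists at the same path in both checkouts. The following is only a minimal sketch under that assumption: the function name `find_shared_objects` and the `subdir` default are hypothetical, and it catches only same-path duplicates, not renamed ones.

```python
import json
from pathlib import Path


def find_shared_objects(public_root, caida_root, subdir="sources"):
    """Return (relative_path, identical) for JSON files present in both trees."""
    public_dir = Path(public_root) / subdir
    caida_dir = Path(caida_root) / subdir
    results = []
    for public_file in sorted(public_dir.rglob("*.json")):
        rel = public_file.relative_to(public_dir)
        caida_file = caida_dir / rel
        if not caida_file.is_file():
            continue  # no object at the same path in catalog-data-caida
        # Compare parsed JSON so whitespace and key order do not matter.
        with public_file.open(encoding="utf-8") as f:
            public_obj = json.load(f)
        with caida_file.open(encoding="utf-8") as f:
            caida_obj = json.load(f)
        results.append((rel.as_posix(), public_obj == caida_obj))
    return results


if __name__ == "__main__":
    # Assumes both repositories are checked out side by side.
    for rel, identical in find_shared_objects("catalog-data", "catalog-data-caida"):
        print(rel, "identical" if identical else "differs")
```

Files reported as "differs" would still need a manual look before anything is removed from catalog-data.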

trdavidt commented 8 months ago

The near matches are listed in the attached markdown file. To regenerate it with the attached script, run python remove-dups.py > duplicates_log.md from the directory that contains ./catalog-data and ./catalog-data-caida. After reviewing all the near matches, I found only 2 duplicate objects in total.

duplicates_log.txt

remove-dups.txt

I attached all the files as .txt because GitHub doesn't support .md or .py attachments.
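For readers without the attachments, here is a rough sketch of how near matches between the two checkouts could be listed. It is not the attached remove-dups script, whose contents are not shown in this thread. It assumes that comparing file names with `difflib.SequenceMatcher` is an acceptable notion of "near match"; the function name `near_matches` and the 0.85 threshold are hypothetical choices.

```python
from difflib import SequenceMatcher
from pathlib import Path


def near_matches(public_root, caida_root, threshold=0.85):
    """Pair up JSON files whose names (without extension) are similar."""
    public_root = Path(public_root)
    caida_root = Path(caida_root)
    public_files = sorted(public_root.rglob("*.json"))
    caida_files = sorted(caida_root.rglob("*.json"))
    matches = []
    for a in public_files:
        for b in caida_files:
            # Similarity of the lowercased file stems, between 0.0 and 1.0.
            ratio = SequenceMatcher(None, a.stem.lower(), b.stem.lower()).ratio()
            if ratio >= threshold:
                matches.append((
                    a.relative_to(public_root).as_posix(),
                    b.relative_to(caida_root).as_posix(),
                    round(ratio, 2),
                ))
    return matches


if __name__ == "__main__":
    # Run from the directory that contains ./catalog-data and ./catalog-data-caida.
    for public_path, caida_path, ratio in near_matches("catalog-data", "catalog-data-caida"):
        print(f"- {public_path} ~ {caida_path} (similarity {ratio})")
```

Each reported pair is only a candidate: as noted above, most near matches turn out not to be true duplicates, so every pair should be checked by hand.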