mawa00006 / Doping-Detection-Based-on-Publicly-Available-Competition-Data-in-Professional-Road-Cycling


Merge wiki_cases data and USADA data #2

Closed tony-hong closed 2 years ago

tony-hong commented 2 years ago

Task:

Processing steps of wiki cases:

  1. filter "event" by keywords such as 'test', 'testing', 'admit', 'ban' ...;
  2. check "ARG1-PERSON" to get the names;
  3. filter out names that are too long;
  4. link the remaining names to the rider page table.
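
The four steps above could be sketched roughly as follows. This is only an illustration, not the code used in the project: the keyword tuple is limited to the four keywords listed (the original list is open-ended), and `MAX_NAME_TOKENS`, `normalize`, `link_cases`, and the `name` / `rider_id` fields of the rider table are hypothetical names chosen for the example. Linking by exact normalized name is also an assumption; real data would probably need fuzzier matching.

```python
import re

# Only the keywords named in the task; the original list is open-ended ("...").
KEYWORDS = ("test", "testing", "admit", "ban")
# Assumed threshold for "too long"; the task does not give one.
MAX_NAME_TOKENS = 4


def normalize(name):
    """Lowercase and collapse whitespace so names compare consistently."""
    return " ".join(name.lower().split())


def link_cases(cases, riders):
    """Return (case, rider) pairs for the cases that pass all four steps."""
    # Prefix match at a word boundary, so "tested" and "banned" also match.
    # This is deliberately loose and would also match e.g. "band".
    pattern = re.compile(r"\b(" + "|".join(KEYWORDS) + r")", re.IGNORECASE)
    rider_index = {normalize(r["name"]): r for r in riders}
    linked = []
    for case in cases:
        # Step 1: keep events that mention one of the keywords.
        if not pattern.search(case.get("event", "")):
            continue
        # Step 2: take the name from the ARG1-PERSON field.
        name = case.get("ARG1-PERSON") or ""
        # Step 3: drop empty names and names with too many tokens.
        if not name or len(name.split()) > MAX_NAME_TOKENS:
            continue
        # Step 4: link to the rider page table by normalized name.
        rider = rider_index.get(normalize(name))
        if rider is not None:
            linked.append((case, rider))
    return linked


# Made-up sample rows, for illustration only.
cases = [
    {"event": "tested positive", "ARG1-PERSON": "Jan Example"},
    {"event": "won the stage", "ARG1-PERSON": "Jan Example"},
    {"event": "was banned", "ARG1-PERSON": "the former leader of the whole squad"},
    {"event": "admitted doping", "ARG1-PERSON": "Unknown Person"},
]
riders = [{"name": "jan example", "rider_id": 1}]
linked = link_cases(cases, riders)
```

With the sample rows, only the first case survives: the second has no keyword, the third has an over-long "name", and the fourth has no matching rider.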

sama25100 commented 2 years ago

Sinnerdata: https://drive.google.com/drive/folders/1K7ehL7AGf94iLRAvwVyHNjgLVbxFHfpQ
Google Colab: https://drive.google.com/drive/folders/1NgYu9JJfxq1bLBr1z-iNJEis3Dn4ECQX

sama25100 commented 2 years ago

I had already merged the two datasets. However, many of the USADA riders were young or not high-level athletes, so Pro Cycling Stats had no race data for them. As a result, all but 30 of those riders were left out of the merged dataset.
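
To make that kind of drop-off visible, one could split the sanctioned riders by whether any race rows exist for them before merging. The sketch below is only an assumption about how this might look, not the code in the Colab notebook; `split_by_race_coverage` and the `rider` field are hypothetical names, and the sample rows are made up.

```python
def split_by_race_coverage(sanctioned_names, race_rows):
    """Split names into (kept, dropped) by whether race data exists for them."""
    # Names are compared case-insensitively with surrounding whitespace removed.
    names_with_races = {row["rider"].strip().lower() for row in race_rows}
    kept, dropped = [], []
    for name in sanctioned_names:
        if name.strip().lower() in names_with_races:
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped


# Made-up sample rows, for illustration only.
race_rows = [{"rider": "Rider A", "race": "Example Classic"}]
kept, dropped = split_by_race_coverage(["Rider A", "Rider B"], race_rows)
```

Inspecting `dropped` shows which riders an inner merge would silently discard.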