oudalab / fajita

Event Data Tagging Tool
MIT License

Create wiki interface #203

Closed by YanLiang1102 6 years ago

YanLiang1102 commented 6 years ago
  1. Which name should be shown: the Arabic one or the English one?
  2. The scraping result has no name or wiki entry for English.
  3. For start_date and end_date, we don't need the nested {$date: real date}; the value should just be the real date directly.
  4. If the date is null, just put "", not null.
  5. The commit info will be stored in our sourceddictionary, not in the wikientity. The order of the cameo_coding entries for each role is messy to handle in the JS model: everything is done via AJAX, so even though we keep the role rank, all the roles would need to be updated when we only want to update one. By keeping the wki_mongoid and the role in sourceddictionary, we can link it back later in post-processing. (Storing it in sourcedictionary also works the same as before for editing and for seeing who did what from the interface.)
  6. We also store the time spent on the wiki entity tagging. @ahalterman
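
A minimal Python sketch of points 3 and 4, assuming each role is a plain dict with `start_date` and `end_date` keys as named above. The helper names `unwrap_date` and `clean_role` are hypothetical and are not part of the fajita code base or the scraper.

```python
# Hypothetical helpers: flatten {"$date": ...} wrappers and replace null dates with "".
DATE_FIELDS = ("start_date", "end_date")


def unwrap_date(value):
    """Return the bare date for a {"$date": ...} wrapper, and "" for a null date."""
    if isinstance(value, dict) and "$date" in value:
        value = value["$date"]
    return "" if value is None else value


def clean_role(role):
    """Return a copy of the role with its date fields flattened."""
    cleaned = dict(role)
    for field in DATE_FIELDS:
        if field in cleaned:
            cleaned[field] = unwrap_date(cleaned[field])
    return cleaned


example = {"title": "Minister", "start_date": {"$date": "1966-01-01"}, "end_date": None}
print(clean_role(example))
```

Only the two date fields are touched, so other values in the role are passed through unchanged.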
ahalterman commented 6 years ago
  1. I'd say show Arabic if available, otherwise use the English name.
  2. I'll take a look and make sure it's there and well-labeled in the JSON.
  3. Okay. Should I change that in how the data gets formatted from the scraper?
  4. Okay.
  5. Do you think we should have an explicit link between the CAMEO code added by the annotator and the wiki title they coded? I think it could be good, since I could see updating the wiki field in the future and then having the number and order be different from the CAMEO codes. Should we add a "wiki_role" field to the CAMEO code that has the name + start date + end date?
  6. Great. Very helpful.
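
One possible shape for the explicit link suggested in point 5, sketched in Python. The function `make_cameo_entry` and the `cameo_code` key are assumptions for illustration; only the idea of a "wiki_role" field carrying the name, start date, and end date comes from the comment above.

```python
# Hypothetical sketch: attach a copy of the coded wiki role to each CAMEO code,
# so the link survives even if the wiki roles are later re-scraped or reordered.
def make_cameo_entry(cameo_code, wiki_role):
    """Bundle a CAMEO code with the title and dates of the role it was coded for."""
    return {
        "cameo_code": cameo_code,
        "wiki_role": {
            "title": wiki_role.get("title", ""),
            "start_date": wiki_role.get("start_date", ""),
            "end_date": wiki_role.get("end_date", ""),
        },
    }


role = {"role_id": 1, "title": "Minister of Health", "start_date": "1966-01-01", "end_date": "1967-01-01"}
entry = make_cameo_entry("GOV", role)  # "GOV" is a placeholder code
print(entry)
```

Copying the title and dates, instead of relying on list position or `role_id`, means the match can still be recovered if the number or order of roles changes.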
YanLiang1102 commented 6 years ago

@ahalterman

  1. Yeah, if you can change that in the scraper, that would be great; otherwise I need to write regexes to strip them from the data you give me.
  2. Good point. Yeah, we can store the role name in the dictionary.
YanLiang1102 commented 6 years ago

Post-processing of the data:

  1. Either check for the English or Arabic result in the interface, or preprocess the data.
  2. "role" needs to be changed back to "title" (done).
  3. "english" needs to be changed back to "en" (done).
  4. Dates need to be changed to the yyyy-mm-dd format; if a date is badly formatted, just show it to the users as-is.
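
A rough Python sketch of point 4, assuming dates arrive as strings. The function name `normalize_date` and the list of accepted input formats are assumptions; the real scraper output may need more patterns. Month names are parsed with `%B`, which depends on the locale being English.

```python
from datetime import datetime

# Assumed input formats; extend as needed for the real data.
KNOWN_FORMATS = ("%Y-%m-%d", "%B %d %Y", "%d %B %Y")


def normalize_date(raw):
    """Return the date as yyyy-mm-dd, or the original string if it cannot be parsed."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    # Badly formatted dates are shown to the user unchanged.
    return raw


for value in ["1967-01-01", " July 1 1960", " mid July 1960"]:
    print(repr(normalize_date(value)))
```

Values that match none of the formats are returned untouched, which matches the fallback of showing badly formatted dates directly.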
YanLiang1102 commented 6 years ago

{'cameo_coding': [],
 'harvested_from': [{'arabic': 'https://ar.wikipedia.org/wiki/',
                     'english': 'https://en.wikipedia.org/wiki/Haji_Bashir_Ismail_Yusuf'}],
 'names': [{'arabic': '', 'english': 'Haji Bashir Ismail Yusuf'}],
 'pageid': [{'arabic': '', 'english': 33606670}],
 'tagged': False,
 'wiki_roles': [{'arabic': [],
                 'en': [{'end_date': '1967-01-01',
                         'title': ' Minister of Health and Labour of the Somali Republic ',
                         'role_id': 1,
                         'start_date': '1966-01-01'},
                        {'end_date': ' mid July 1960',
                         'role': ' President of the Parliament of SomaliaSomali National Assembly ',
                         'role_id': 2,
                         'start_date': ' July 1 1960'}]}],
 'wiki_scrape_date': '2018-01-12 15:01'}

@khaledJabr
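
For a record shaped like the one above, the "Arabic if available, otherwise English" rule discussed earlier in the thread could be sketched in Python as follows. The function `display_name` is hypothetical, and it assumes `names` is a list of dicts with `arabic` and `english` keys, as in the sample.

```python
# Hypothetical helper: prefer a non-empty Arabic name, fall back to English.
def display_name(record):
    """Return the Arabic name if one is present, else the English name, else ""."""
    for key in ("arabic", "english"):
        for names in record.get("names", []):
            value = (names.get(key) or "").strip()
            if value:
                return value
    return ""


record = {"names": [{"arabic": "", "english": "Haji Bashir Ismail Yusuf"}]}
print(display_name(record))  # no Arabic name here, so the English one is used
```

The same check could run either in the interface or as a preprocessing step over the scraped data.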