medialab / halexp

medialab's expert search engine poc
GNU General Public License v3.0

Fix co-authors bug. #21

jimenaRL closed 7 months ago

jimenaRL commented 7 months ago

It seems to be solved. The problem arose because we were using a list of author ids/names and a list of the authors' lab structure ids/names that were not meant to match 1 to 1 (authFullName_s, authIdHal_i, labStructName_s and labStructId_i).

Now using authIdHasPrimaryStructure_fs and authFullNameId_fs keys.

I had to make some arbitrary decisions, since HAL data is often incomplete (for instance, authors without a HAL id, authors without a lab, etc.):

  1. Authors without authId_i are dropped and not considered for a given document.
  2. Each author has a list of lab structures to which they can be attached.
  3. If an author doesn't have any attached lab structure, the list is empty.
  4. Sciences Po lab structures are recorded in the 'signature' of each author, which is also a list.
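
For reference, here is a rough sketch of how the four rules above could look in Python. It is not the actual halexp code. It assumes that HAL joins the parts of these facet values with `_FacetSep_` and `_JoinSep_`, that `authFullNameId_fs` entries look like `name_FacetSep_id`, and that `authIdHasPrimaryStructure_fs` entries look like `authId_FacetSep_name_JoinSep_structId_FacetSep_structName`; this should be checked against a real API response. The function `parse_authors`, the `sciences_po_ids` parameter and the sample document are made up for illustration.

```python
# Assumed HAL facet separators; verify against real API output.
FACET_SEP = "_FacetSep_"
JOIN_SEP = "_JoinSep_"


def parse_authors(doc, sciences_po_ids=frozenset()):
    """Build {author_id: {"name", "labs", "signature"}} for one HAL document.

    `sciences_po_ids` is a hypothetical set of structure ids that should
    count as Sciences Po labs.
    """
    authors = {}

    # Rule 1: authors without an id are dropped.
    for entry in doc.get("authFullNameId_fs", []):
        name, _, auth_id = entry.partition(FACET_SEP)
        if not auth_id:
            continue
        # Rule 3: the lab list starts empty and stays empty if nothing matches.
        authors[auth_id] = {"name": name, "labs": [], "signature": []}

    # Rule 2: attach every lab structure listed for a known author.
    for entry in doc.get("authIdHasPrimaryStructure_fs", []):
        author_part, sep, lab_part = entry.partition(JOIN_SEP)
        if not sep:
            continue  # malformed entry, no author/structure pair
        auth_id, _, _name = author_part.partition(FACET_SEP)
        lab_id, _, lab_name = lab_part.partition(FACET_SEP)
        if auth_id not in authors or not lab_id:
            continue
        lab = {"id": lab_id, "name": lab_name}
        authors[auth_id]["labs"].append(lab)
        # Rule 4: Sciences Po structures also go into the 'signature' list.
        if lab_id in sciences_po_ids:
            authors[auth_id]["signature"].append(lab)

    return authors


# Made-up document: one author with two labs, one without an id, one without a lab.
doc = {
    "authFullNameId_fs": [
        "Ada Example_FacetSep_101",
        "Bob Sample_FacetSep_",
        "Cy Test_FacetSep_103",
    ],
    "authIdHasPrimaryStructure_fs": [
        "101_FacetSep_Ada Example_JoinSep_9001_FacetSep_Lab A",
        "101_FacetSep_Ada Example_JoinSep_9002_FacetSep_Lab B",
    ],
}
result = parse_authors(doc, sciences_po_ids={"9001"})
```

Keying everything on the author id taken from a single combined field is what avoids the original bug: names, ids and labs are never paired by their position in separate lists.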