colav / impactu

Colav Impactu Issues and Documentation
BSD 3-Clause "New" or "Revised" License

Duplicates in OpenAlex with different access types: how do I merge them? #198

Open omazapa opened 4 months ago

omazapa commented 4 months ago

@restrepo

We have 45 works that share a DOI with another work, and in many cases the duplicated records carry different data, for example the access type.

[{'_id': 'https://doi.org/10.1007/jhep12(2022)072',
  'unique_docs': ['https://openalex.org/W4307536594',
   'https://openalex.org/W4313533668'],
  'count': 2},
 {'_id': 'https://doi.org/10.1007/s00132-004-0689-1',
  'unique_docs': ['https://openalex.org/W1502732708',
   'https://openalex.org/W3209524113'],
  'count': 2},
 {'_id': 'https://doi.org/10.1007/s00209-022-03183-5',
  'unique_docs': ['https://openalex.org/W3132483183',
   'https://openalex.org/W4315874871'],
  'count': 2},
 {'_id': 'https://doi.org/10.1016/j.acuro.2011.02.006',
  'unique_docs': ['https://openalex.org/W2045924096',
   'https://openalex.org/W1976554246'],
  'count': 2},
 {'_id': 'https://doi.org/10.1016/j.gaceta.2008.02.001',
  'unique_docs': ['https://openalex.org/W2615553976',
   'https://openalex.org/W2108243875'],
  'count': 2},
 {'_id': 'https://doi.org/10.1016/j.gaceta.2009.09.004',
  'unique_docs': ['https://openalex.org/W2726623569',
   'https://openalex.org/W2168925627'],
  'count': 2},
 {'_id': 'https://doi.org/10.1016/j.jval.2011.05.023',
  'unique_docs': ['https://openalex.org/W2990441777',
   'https://openalex.org/W2147495668'],
  'count': 2},
 {'_id': 'https://doi.org/10.1016/s0210-4806(08)73813-0',
  'unique_docs': ['https://openalex.org/W2050021865',
   'https://openalex.org/W2615931363'],
  'count': 2},
 {'_id': 'https://doi.org/10.1016/s0213-9111(08)75346-9',
  'unique_docs': ['https://openalex.org/W2017162148',
   'https://openalex.org/W1497535761'],
  'count': 2},
 {'_id': 'https://doi.org/10.1038/nprot.nprot.2013.021',
  'unique_docs': ['https://openalex.org/W1972412013',
   'https://openalex.org/W3042181689'],
  'count': 2},
 {'_id': 'https://doi.org/10.1093/rheumatology/keac028',
  'unique_docs': ['https://openalex.org/W4207033704',
   'https://openalex.org/W4206560657'],
  'count': 2},
 {'_id': 'https://doi.org/10.1097/sap.0000000000002796',
  'unique_docs': ['https://openalex.org/W3181179560',
   'https://openalex.org/W4225700545'],
  'count': 2},
 {'_id': 'https://doi.org/10.1157/13088853',
  'unique_docs': ['https://openalex.org/W2615011954',
   'https://openalex.org/W2049146451'],
  'count': 2},
 {'_id': 'https://doi.org/10.1157/13093203',
  'unique_docs': ['https://openalex.org/W2040340068',
   'https://openalex.org/W2614864245'],
  'count': 2},
 {'_id': 'https://doi.org/10.15446/rsap.v18n6.38871',
  'unique_docs': ['https://openalex.org/W2623451974',
   'https://openalex.org/W2591517564'],
  'count': 2},
 {'_id': 'https://doi.org/10.15446/rsap.v19n2.55175',
  'unique_docs': ['https://openalex.org/W2790802674',
   'https://openalex.org/W2787566479'],
  'count': 2},
 {'_id': 'https://doi.org/10.15446/rsap.v19n4.51787',
  'unique_docs': ['https://openalex.org/W3048263530',
   'https://openalex.org/W2798112881'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/0102-311x00210715',
  'unique_docs': ['https://openalex.org/W2531731619',
   'https://openalex.org/W2528247587'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0034-89102007000600011',
  'unique_docs': ['https://openalex.org/W2142408218',
   'https://openalex.org/W2058675092'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0034-89102011000200022',
  'unique_docs': ['https://openalex.org/W1543770047',
   'https://openalex.org/W2164679287'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0080-62342010000300041',
  'unique_docs': ['https://openalex.org/W1583264502',
   'https://openalex.org/W2149644857'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0104-11692005000700012',
  'unique_docs': ['https://openalex.org/W2107473236',
   'https://openalex.org/W1551743574'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0104-11692009000700020',
  'unique_docs': ['https://openalex.org/W1520876499',
   'https://openalex.org/W1966147312'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0104-11692010000700013',
  'unique_docs': ['https://openalex.org/W1540758269',
   'https://openalex.org/W1979430436'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0104-11692011000700003',
  'unique_docs': ['https://openalex.org/W1747505348',
   'https://openalex.org/W1825181596'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0104-11692011000700007',
  'unique_docs': ['https://openalex.org/W1968932567',
   'https://openalex.org/W3204795003'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0124-00642007000200007',
  'unique_docs': ['https://openalex.org/W3046651959',
   'https://openalex.org/W1838395798'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0124-00642007000300011',
  'unique_docs': ['https://openalex.org/W1887510815',
   'https://openalex.org/W1949974986'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0124-00642008000500012',
  'unique_docs': ['https://openalex.org/W2796230821',
   'https://openalex.org/W1904402773'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0124-00642009000300015',
  'unique_docs': ['https://openalex.org/W1900646461',
   'https://openalex.org/W3036622919'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0124-00642009000400015',
  'unique_docs': ['https://openalex.org/W1915016137',
   'https://openalex.org/W3214572249'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s0124-00642010000200010',
  'unique_docs': ['https://openalex.org/W2165486408',
   'https://openalex.org/W3145703764'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s1020-49892009001100006',
  'unique_docs': ['https://openalex.org/W2954909943',
   'https://openalex.org/W2164293129'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s1413-86702010000200008',
  'unique_docs': ['https://openalex.org/W4252944204',
   'https://openalex.org/W2147785704'],
  'count': 2},
 {'_id': 'https://doi.org/10.1590/s1980-220x2016042103264',
  'unique_docs': ['https://openalex.org/W2773432127',
   'https://openalex.org/W3010833465'],
  'count': 2},
 {'_id': 'https://doi.org/10.16899/jcm.1249428',
  'unique_docs': ['https://openalex.org/W4324128271',
   'https://openalex.org/W4328051144'],
  'count': 2},
 {'_id': 'https://doi.org/10.17151/eleu.2019.20.7',
  'unique_docs': ['https://openalex.org/W4291926945',
   'https://openalex.org/W3001072186'],
  'count': 2},
 {'_id': 'https://doi.org/10.26620/uniminuto.praxis.12.13.2012.90-103',
  'unique_docs': ['https://openalex.org/W4310968005',
   'https://openalex.org/W1922220308'],
  'count': 2},
 {'_id': 'https://doi.org/10.31910/rudca.v20.n1.2017.66',
  'unique_docs': ['https://openalex.org/W4323238724',
   'https://openalex.org/W2763377433'],
  'count': 2},
 {'_id': 'https://doi.org/10.33571/rpolitec.v16n32a5',
  'unique_docs': ['https://openalex.org/W3117087949',
   'https://openalex.org/W4238761844'],
  'count': 2},
 {'_id': 'https://doi.org/10.5212/olharprofr.v.23.15985.',
  'unique_docs': ['https://openalex.org/W4242961356',
   'https://openalex.org/W4285581946'],
  'count': 2},
 {'_id': 'https://doi.org/10.5546/aap.2011.519',
  'unique_docs': ['https://openalex.org/W2022615942',
   'https://openalex.org/W1599526842'],
  'count': 2},
 {'_id': 'https://doi.org/10.5546/aap.2012.e47',
  'unique_docs': ['https://openalex.org/W1977954869',
   'https://openalex.org/W3150152710'],
  'count': 2},
 {'_id': 'https://doi.org/10.55467/reder.v4i2.50',
  'unique_docs': ['https://openalex.org/W3045413655',
   'https://openalex.org/W3209167116'],
  'count': 2}]
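
For reference, a listing of this shape can be produced by grouping works by DOI and keeping only the DOIs that map to more than one OpenAlex id. The thread does not show the query that was actually run, so the following is only a minimal pure-Python sketch. It assumes each work is a dict with the OpenAlex `id` and `doi` fields; the function name `find_duplicate_dois`, the lowercasing of the DOI, and the third sample record are illustrative assumptions, not part of the original data.

```python
from collections import defaultdict


def find_duplicate_dois(works):
    """Group OpenAlex work ids by DOI; return only DOIs shared by 2+ works."""
    ids_by_doi = defaultdict(set)
    for work in works:
        doi = work.get("doi")
        if doi:  # works without a DOI cannot be matched this way
            # Assumption: DOIs are compared case-insensitively.
            ids_by_doi[doi.lower()].add(work["id"])
    return [
        {"_id": doi, "unique_docs": sorted(ids), "count": len(ids)}
        for doi, ids in sorted(ids_by_doi.items())
        if len(ids) > 1
    ]


works = [
    {"id": "https://openalex.org/W4307536594",
     "doi": "https://doi.org/10.1007/jhep12(2022)072"},
    {"id": "https://openalex.org/W4313533668",
     "doi": "https://doi.org/10.1007/jhep12(2022)072"},
    # Hypothetical record with a unique DOI; it must not be reported.
    {"id": "https://openalex.org/W0000000001",
     "doi": "https://doi.org/10.0000/hypothetical"},
]

duplicates = find_duplicate_dois(works)
```

On a MongoDB collection the same grouping would more likely be written as an aggregation with `$group` on the DOI, but that depends on how the collection is laid out, which the thread does not specify.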
omazapa commented 4 months ago

If I leave it as it is right now, without updating, some OpenAlex work IDs will not show up.

https://github.com/colav/impactu/issues/184

restrepo commented 4 months ago

@omazapa Keep the record in which this field is not empty:

primary_location → source → issn_l


and remove the other records from the database.
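
The rule above can be sketched as follows. This is a hedged illustration, not the project's actual merge code: it assumes each duplicate is a dict shaped like an OpenAlex work, where `primary_location` and `source` may be missing or null. The helper names `get_issn_l` and `split_keep_remove`, the sample ids, and the ISSN-L value are all hypothetical. The thread does not say what to do when none or several of the duplicates have an `issn_l`; here the first candidate is kept, which is purely an assumption.

```python
def get_issn_l(work):
    """Return primary_location.source.issn_l, or None if any level is missing."""
    primary_location = work.get("primary_location") or {}
    source = primary_location.get("source") or {}
    return source.get("issn_l")


def split_keep_remove(duplicates):
    """Pick the work to keep and list the ids of the works to remove."""
    with_issn_l = [w for w in duplicates if get_issn_l(w)]
    # Assumption (not stated in the thread): if several works have an
    # issn_l, or none do, fall back to the first candidate in input order.
    keeper = (with_issn_l or duplicates)[0]
    to_remove = [w["id"] for w in duplicates if w is not keeper]
    return keeper, to_remove


# Hypothetical pair of works sharing one DOI.
pair = [
    {"id": "https://openalex.org/W_EXAMPLE_1",
     "primary_location": {"source": None}},
    {"id": "https://openalex.org/W_EXAMPLE_2",
     "primary_location": {"source": {"issn_l": "0000-0000"}}},
]

keeper, to_remove = split_keep_remove(pair)
```

The ids in `to_remove` are the ones to delete from the database; keeping that step separate from the selection makes it easy to review the 45 cases before anything is removed.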