Add article relationships identified by MGC

ryscher commented 1 year ago

Ted Haberman identified more article relationships that should be added to datasets. Add these to the system.

File is in email from Ted on September 18.

ryscher commented 1 year ago

This may contain all of the entries from #2050.

ryscher commented 1 year ago

Follow up with Ted when this is done.

DragosIorgulescu commented 11 months ago

A couple of notes:

Look up existing Dryad records by DataDOI.
The ArticleDOI values need to become primary article relations.
Strip the scheme and domain name (e.g. https://doi.org/) from the ArticleDOI so that it starts with the number.
Add the ISSN value if the record does not already have one.
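
For the DOI stripping note, a minimal sketch of what the normalization could look like. The helper name `bare_doi` and the set of accepted prefixes (http/https, doi.org, dx.doi.org, and a leading `doi:`) are assumptions, not something taken from the file, and the sample DOIs are made up for illustration:

```ruby
# Hypothetical helper: reduce a DOI URL to the bare DOI ("start with the number").
# The accepted prefixes are an assumption about what the CSV may contain.
DOI_PREFIX = %r{\A\s*(?:https?://(?:dx\.)?doi\.org/|doi:)\s*}i

def bare_doi(value)
  return nil if value.nil?

  stripped = value.sub(DOI_PREFIX, '').strip
  stripped.empty? ? nil : stripped
end

bare_doi('https://doi.org/10.1111/mec.12345')   # => "10.1111/mec.12345"
bare_doi('http://dx.doi.org/10.1111/mec.12345') # => "10.1111/mec.12345"
bare_doi('10.1111/mec.12345')                   # => "10.1111/mec.12345"
```

Returning nil for blank input makes it easy to skip rows that have no article DOI at all.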

DragosIorgulescu commented 10 months ago

Script proposal to be run:

require 'csv'

# `path` is the location of the CSV file; it must be set before running this.
content = CSV.read(path, headers: true)
updated_issn_identifiers = []
updated_identifiers_article_doi = []

content.each do |row|
  doi = row['DataDOI']
  identifier = StashEngine::Identifier.find_by(identifier: doi)
  next unless identifier

  puts "Found Identifier for #{doi}"

  # Add the ISSN only when the dataset does not already have one.
  if identifier.publication_issn.blank? && row['ISSN'].present?
    StashEngine::InternalDatum.create!(
      stash_identifier: identifier,
      data_type: 'publicationISSN',
      value: row['ISSN'].strip
    )

    updated_issn_identifiers << identifier.id
    puts "Added missing ISSN for #{doi}: #{row['ISSN']}"
  end

  # The notes above call the column ArticleDOI; the first draft read articleDOI, so accept either.
  article_doi = (row['ArticleDOI'] || row['articleDOI']).to_s.strip
  # Strip the scheme and domain so the value starts with the DOI number.
  primary_article_doi = article_doi.sub(%r{\Ahttps?://(dx\.)?doi\.org/}i, '')
  next if primary_article_doi.blank?
  next if identifier.publication_article_doi == primary_article_doi

  resource = identifier.resources.first
  next unless resource

  puts "Creating related identifier for #{doi} with article DOI #{primary_article_doi}"
  related_identifier = resource.related_identifiers.build(
    related_identifier_type: 'doi',
    work_type: 'primary_article',
    related_identifier: primary_article_doi
  )
  related_identifier.save!

  updated_identifiers_article_doi << identifier.id
end

puts "Updated ISSN for: #{updated_issn_identifiers} & primary article for #{updated_identifiers_article_doi}"

Items left:

[x] local testing & validation
[x] data verification & sanitization
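
For reference, a rough sketch of the kind of pre-flight check the data verification step could use before the script touches the database. The method name `csv_problems` and the format checks are assumptions: the column names follow the notes above, the DOI pattern is only a heuristic, and the ISSN pattern is the standard NNNN-NNNC shape. The sample rows are made up.

```ruby
require 'csv'

# Heuristic patterns; both are assumptions about what "valid" means for this file.
DOI_FORMAT  = %r{\A10\.\d{4,9}/\S+\z}
ISSN_FORMAT = /\A\d{4}-\d{3}[\dXx]\z/

# Returns [line_number, problem] pairs for rows that look wrong.
def csv_problems(csv_text)
  problems = []
  CSV.parse(csv_text, headers: true).each_with_index do |row, index|
    line = index + 2 # line 1 is the header row
    data_doi = row['DataDOI'].to_s.strip
    article_doi = row['ArticleDOI'].to_s.strip.sub(%r{\Ahttps?://(?:dx\.)?doi\.org/}i, '')
    issn = row['ISSN'].to_s.strip

    problems << [line, 'missing DataDOI'] if data_doi.empty?
    problems << [line, "unexpected ArticleDOI: #{article_doi.inspect}"] unless article_doi.match?(DOI_FORMAT)
    # An empty ISSN is allowed; the script simply skips it.
    problems << [line, "unexpected ISSN: #{issn.inspect}"] unless issn.empty? || issn.match?(ISSN_FORMAT)
  end
  problems
end

sample = <<~CSV
  DataDOI,ArticleDOI,ISSN
  doi:10.5061/dryad.aaaa,https://doi.org/10.1000/example.1,1234-567X
  ,not-a-doi,12345678
CSV

csv_problems(sample).each { |line, problem| puts "line #{line}: #{problem}" }
```

Running this against the real file first gives a list of rows to fix or drop by hand, so the update script only sees clean input.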

datadryad / dryad-product-roadmap

Add article relationships identified by MGC #2927