neuroquery / pubget

Collecting papers from PubMed Central and extracting text, metadata and stereotactic coordinates.
https://neuroquery.github.io/pubget/
MIT License

Preserve texts related to public datasets #51

Open leej3 opened 1 month ago

leej3 commented 1 month ago

Supersedes #48 on behalf of @agt24 so that I can keep track of it in gitpay.

Original description: This issue stems from recent conversations between @jeromedockes @koudyk @jbpoline @adelavega and myself.

The goal is to make it possible/easier to label text that specifies public datasets: both those that were downloaded and used in the paper’s analysis and those that were collected/created in the work described in the paper and deposited in a public repository.

In the examples we’ve looked at, these bits of text tend to be inside tags that pubget currently strips away for readability. I've included two examples below from the same article.xml file from PMC9622880, which is also attached.

Example 1 on Line 137, the dataset is referenced inside an tag:

These neural and behavioral data have been made publicly available as a large-scale database of autobiographical memory (https://osf.io/exb7m/)

Example 2, on line 339: the dataset is referenced inside a <notes> tag:

<notes notes-type="data-availability">
  <title>Data availability</title>
  <p>Video features, memory features, and fMRI data generated in this study have been deposited in a repository on the Open Science Framework under access link https://osf.io/exb7m/. Raw memory videos and memory geocoordinate information are protected and are not available due to data privacy laws. The graph data generated in this study are provided in the Source Data file. Source data are provided with this paper.

Hopefully, this can be accomplished with some minor modifications to the text_extraction.xsl stylesheet.
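
To make the target concrete, here is a rough Python sketch of the text we would like to keep. It is not pubget code and it does not touch the stylesheet; it only uses the standard library on a small, made-up JATS fragment modelled on the two examples above. The `<ext-link>` element and its `xlink:href` attribute in the sample are an assumption about how the URL in Example 1 is marked up, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xlink:href attribute used on JATS links.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, trimmed-down fragment modelled on the examples in this issue.
# The <ext-link> markup is an assumption, not copied from the real article.xml.
SAMPLE = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <p>These data have been made publicly available
      (<ext-link ext-link-type="uri" xlink:href="https://osf.io/exb7m/">https://osf.io/exb7m/</ext-link>)</p>
  </body>
  <back>
    <notes notes-type="data-availability">
      <title>Data availability</title>
      <p>Data generated in this study have been deposited at
        <ext-link ext-link-type="uri" xlink:href="https://osf.io/exb7m/">https://osf.io/exb7m/</ext-link>.</p>
    </notes>
  </back>
</article>"""


def data_availability_text(root):
    """Return the whitespace-normalised text of each data-availability <notes> block."""
    blocks = []
    for notes in root.iter("notes"):
        if notes.get("notes-type") == "data-availability":
            blocks.append(" ".join("".join(notes.itertext()).split()))
    return blocks


def external_links(root):
    """Return the xlink:href of every <ext-link>, in document order."""
    return [el.get(XLINK_HREF) for el in root.iter("ext-link") if el.get(XLINK_HREF)]


root = ET.fromstring(SAMPLE)
statements = data_availability_text(root)
links = external_links(root)
print(statements)
print(links)
```

The same two selections (the data-availability notes and the link targets) are what a stylesheet change would need to preserve; whether that is best done in text_extraction.xsl or in a separate extraction step is still open.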

I'll try to add a few more example articles to this issue when I can.

leej3 commented 1 week ago

@Precious-Macaulay any progress on this?

Precious-Macaulay commented 1 week ago

> @Precious-Macaulay any progress on this?

Yes, I sent a gitpay proposal but have had no response yet. Please check your gitpay account.

leej3 commented 1 week ago

Odd, nothing showed up. Could you do that again please?

Precious-Macaulay commented 1 week ago

Yes, I have done that again. Please check.