presagia-analytics / ctrialsgov

Query Data from ClinicalTrials.gov
https://presagia-analytics.github.io/ctrialsgov/

Updating the current downloaded dataframe #10

Closed OleksiyAnokhin closed 1 year ago

OleksiyAnokhin commented 2 years ago

Hi team, how can I add more data to your table? For instance, I would like to get information about the Principal Investigator. Is there a quick way to do this?

"SponsorCollaboratorsModule":{
          "ResponsibleParty":{
            "ResponsiblePartyType":"Principal Investigator",
            "ResponsiblePartyInvestigatorFullName":"Are Annesønn Kalstad",
            "ResponsiblePartyInvestigatorTitle":"Medical Doctor",
            "ResponsiblePartyInvestigatorAffiliation":"Oslo University Hospital"
          },
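
For readers who have a study record as JSON, here is a minimal Python sketch of pulling the fields quoted above out of it. It assumes the fragment has been wrapped in braces to form a complete object; the helper name `principal_investigator` is hypothetical and not part of ctrialsgov, and the position of this module within the full record is not shown in this thread.

```python
import json

# The fragment quoted above, wrapped in braces so that it parses as a
# complete JSON object. Any other fields of the real record are omitted.
raw = """
{
  "SponsorCollaboratorsModule": {
    "ResponsibleParty": {
      "ResponsiblePartyType": "Principal Investigator",
      "ResponsiblePartyInvestigatorFullName": "Are Annesønn Kalstad",
      "ResponsiblePartyInvestigatorTitle": "Medical Doctor",
      "ResponsiblePartyInvestigatorAffiliation": "Oslo University Hospital"
    }
  }
}
"""


def principal_investigator(record):
    """Return PI details as a dict, or None if no PI is listed.

    Hypothetical helper: `record` is the object that directly contains
    the "SponsorCollaboratorsModule" key.
    """
    party = record.get("SponsorCollaboratorsModule", {}).get("ResponsibleParty", {})
    if party.get("ResponsiblePartyType") != "Principal Investigator":
        return None
    return {
        "name": party.get("ResponsiblePartyInvestigatorFullName"),
        "title": party.get("ResponsiblePartyInvestigatorTitle"),
        "affiliation": party.get("ResponsiblePartyInvestigatorAffiliation"),
    }


pi = principal_investigator(json.loads(raw))
print(pi)
```

Using `.get` with defaults means a record without a responsible party yields `None` instead of raising an error.
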
kaneplusplus commented 2 years ago

Hi @OleksiyAnokhin. Do you want this specifically for Principal Investigator or in general? In either case, are you requesting that we do it (or provide direction on how to do it), or do you think you can do it and provide a pull request?

We don't have much of a process for this currently. @statsmaths and I prioritized the tables that we thought would have the most value and then thought we'd add more as needed. It would be great to get this input.

OleksiyAnokhin commented 2 years ago

@kaneplusplus, unfortunately I cannot do it myself. You and many other repo owners have put together "the most common" columns, and I did not find a good programmatic solution for extracting this information separately. I finally did it with a development R library from GitHub. The idea interests me because I wanted to connect PubMed with ClinicalTrials.gov data and see who the principal investigators of the trials are and which PubMed articles are related. Maybe you can put it on your to-do list?
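
The linkage described in this comment could be sketched in Python roughly as follows. This is an illustration only: it assumes each article row already carries a trial registry identifier (how to obtain that from PubMed is not covered in this thread), and every identifier, name, and the helper `link_articles_to_trials` is a made-up placeholder, not real data or part of any package.

```python
from collections import defaultdict

# Placeholder rows for illustration only; none of these values are real.
trials = [
    {"nct_id": "NCT-EXAMPLE-1", "pi_name": "Investigator A"},
    {"nct_id": "NCT-EXAMPLE-2", "pi_name": "Investigator B"},
]
articles = [
    {"pmid": "PMID-EXAMPLE-1", "nct_id": "NCT-EXAMPLE-1"},
    {"pmid": "PMID-EXAMPLE-2", "nct_id": "NCT-EXAMPLE-1"},
    {"pmid": "PMID-EXAMPLE-3", "nct_id": "NCT-EXAMPLE-9"},  # no matching trial
]


def link_articles_to_trials(trials, articles):
    """Attach the list of related article IDs to each trial row.

    Hypothetical helper: a left join of trials to articles on "nct_id".
    """
    pmids_by_trial = defaultdict(list)
    for article in articles:
        pmids_by_trial[article["nct_id"]].append(article["pmid"])
    # Trials without any article get an empty list rather than being dropped.
    return [{**trial, "pmids": pmids_by_trial.get(trial["nct_id"], [])} for trial in trials]


linked = link_articles_to_trials(trials, articles)
for row in linked:
    print(row["pi_name"], row["pmids"])
```

Keeping every trial row, even those with no articles, makes it easy to see which investigators have no linked publications.
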

kaneplusplus commented 2 years ago

Sure. It looks like this information is included in the administrative tables. Are there other fields in clinicaltrials.gov that are not included that are of interest?

OleksiyAnokhin commented 2 years ago

@kaneplusplus, I think this is it for now. Just think about linking such data with other datasets. That is where the value of the package lies: extracting data that can be analyzed together with something else.

kaneplusplus commented 2 years ago

I think the focus of the package will always be querying clinicaltrials.gov databases. Users are welcome to create their own packages that integrate its functionality and that may, in turn, join with other packages. There are a lot of other datasets to link to, and it would be difficult to service all of them.