SAFEHR-data / uclh-research-discovery

SAFEHR Health Data Cards
https://safehr-data.github.io/uclh-research-discovery/

Gather researcher feedback #5

Closed tcouch closed 8 months ago

tcouch commented 9 months ago

Carry out interviews and a run-through of the project submission process with researchers to get their feedback on the first draft of the site.

tcouch commented 9 months ago

A semi-structured approach to trialling the UCLH research discovery site with researchers.

Introduction

(to be read out to the researchers)

Thank you for agreeing to take part in trialling the site and giving us your feedback. First we’ll briefly describe the purpose of the site.

UCLH research discovery is a new site developed as part of a collaboration between UCLH CRIU and UCL ARC with the aim of helping the public (and other researchers) to better understand the research being conducted at UCLH, and how patient data is being used in the course of that research. Researchers will be able to add their own content to describe their work including details such as how patient data was used, how it was kept safe, the analysis that was carried out, and the outcomes that were produced. By publishing synthetic datasets and code in UCL’s research data repository, you can even show people how to repeat the steps you took to carry out the research, and perhaps help other researchers to repeat your approach with other datasets.

Contributions to the site are made through GitHub, a popular online platform for collaborating on software. While some experience would be beneficial, we have written guidance describing how to submit content and how the peer review process works.

Today, we’d like to trial this approach with you and find out what we need to do to improve the concept and the support provided.

Part 1: Requirements and opinions

To gather feedback on the overall concept of the site and the type of content it should contain. Again, the questions are supposed to be open-ended, but the checklist items can be used to prompt for their opinion about specific aspects.

Content

Question: What sort of content would you expect to see on a site that aims to share information about research projects with the public?

Checklist:

Concerns

Question: Do you have any concerns about sharing information about your project?

Checklist:

Part 2: Technical skills

To assess researchers’ experience with the tools they’d need to use in order to contribute to the site. The questions are supposed to be open-ended, and the checklist is there as a series of additional prompts we might use to make sure we cover all the areas.

Git and GitHub

Question: Tell me about your level of familiarity and experience with GitHub.

Checklist:

Markdown

Question: Have you created a document with markdown before? If so, can you briefly describe what you’ve done and any tools or packages you’ve used alongside it?

Checklist:

General

Question: Do you have any coding experience? If so, can you briefly describe the kind of things you’ve used it for?

Checklist:

Part 3: Contribution walkthrough

Either in person or sharing their screen, researchers follow the guidance on submitting a project. Explain that the focus is not on the quality of the content so just do the minimum. We give prompts if they get stuck, but otherwise just note down things like:

Once they make a pull request, carry out the review including:

Note how they respond and how they resolve these requests.

acholyn commented 9 months ago

Meeting done Dec 6 with Hrishee


Notes

He did data science at UCLH and was also doing clinical work, but this has expired for the moment.

Part 1 - Requirements and opinions

Checklist:

features are generated from a lot of other tables to result in ~200 features which will output to 1

Checklist:

Part 2 - Technical Skills

Git and GitHub

Checklist:

Markdown

Checklist:

General

Checklist:

Part 3 - Contribution Walkthrough

Either in person or sharing their screen, researchers follow the guidance on submitting a project. Explain that the focus is not on the quality of the content so just do the minimum. We give prompts if they get stuck, but otherwise just note down things like:

We might need to make it clear that the UCL login sometimes doesn't work: it sent him into a loop and just denied him access. The data repo said it doesn't have an account for him and that he should contact the institution. People may need to sign up for an account if UCL hasn't already made one for them.

We should consider including some guidance for running the cloned repo locally so they can see their project and check it's to their satisfaction before opening the PR. This may also help with identifying e.g. broken links. It could be suggested just for those who are comfortable, and hopefully the reviewers will catch the other bits?
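As one possible form of local check for broken links, here is a minimal Python sketch. It assumes project pages are Markdown files that link to each other with relative paths. The function name `broken_local_links` is hypothetical and not part of the site's tooling. It only checks that linked files exist on disk and does not test external URLs.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)")
# Matches targets that start with a URL scheme such as https: or mailto:
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_local_links(project_dir):
    """Return (file name, target) pairs for relative links whose target is missing."""
    broken = []
    for md_file in sorted(Path(project_dir).rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            # Skip external URLs, in-page anchors and site-root paths,
            # which cannot be resolved from the folder alone.
            if SCHEME_PATTERN.match(target) or target.startswith(("#", "/")):
                continue
            # Drop any "#section" suffix before checking the file exists.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                broken.append((md_file.name, target))
    return broken
```

A contributor comfortable with Python could call `broken_local_links("path/to/project")` on their cloned folder before opening the PR. An empty list means no missing local targets were found.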

We could also take some inspiration from https://uclhal.org/tutorials/onboarding/onboarding-1.html regarding the style of instruction.

The following wasn't done due to errors on our part but he's happy to go ahead with it in future. Once they make a pull request, carry out the review including:

Note how they respond and how they resolve these requests.


Main takeaway: they'd like some help/guidance on how to synthesise their data and if we could encourage people to use the reforms then we can adhere to a growing international standard.

tcouch commented 9 months ago

Meeting 6th Dec with Elizabeth Brown and Samiran Ray

Requirements and Opinions

Elizabeth: It's a good way of showing patient involvement which is a good line that you need to put into grant applications. In terms of who the audience is, I think what you would expect to include would be quite different. Would there be a way to switch between professional and lay person views?

So on the one hand you have the average person who is a patient at UCLH and is worried about their data and wants to see what's being done with it. And then you have a researcher or someone who's interested in a particular area who wants to have a look at the exact code you're using and how you're treating certain variables.

Lay person summary

Samiran: I've not written one personally - we have stuff which gets a public press release so there are strap lines, that kind of thing. Then there are public-facing databases like ClinicalTrials.gov where you have to register your project when you start it and provide details of the output, which goes into a lot of detail, and you have to answer a lot of questions, providing numbers etc.

Samiran: When you're writing lay summaries you tend to gloss over a lot of details, but you have to do that for certain professional audiences too to some degree. So it's more than just two audiences.

Open access data/code

EB: If someone was asking me to put code on a public website, I would feel worried that people would start nitpicking. You don't want to simplify things because you don't want people to come back and say "you haven't done this" or "this is wrong", and so almost out of a defensive instinct you end up putting in way too much detail because you feel exposed. I can't really imagine how I'd take a small part of what I've done (code) to demonstrate - you'd almost have to put the whole thing up, and then it would feel like "have a look at my methods and tell me if you think they're right".

SR: So, how would this be any different to publishing my code on GitHub?

EB/SR: We don't make our code available on GitHub

SR: My coding is nowhere near in terms of finesse...

TC: Is there anything that could convince you to change your mind?

SR: Somebody having a look at my code first

EB: And only at the end of your project (which is a long process) because you wouldn't want people taking advantage

Technical

Quarto: Both never used

GitHub

Can you explain the difference between Git and GitHub?

Have you ever used Git on your own machine?

Can you create, switch and merge branches?

Do you have a GitHub account?

Have you ever commented on an issue or pull request?

Have you ever created a pull request?

Do you know the difference between a public and private repository?

Markdown

Have you heard of markdown?

Do you feel comfortable writing a doc?

Coding

Preferred languages?

Have you ever used Jekyll?

Do you know what a YAML file is?

Runthrough

EB:

SR:

Snags

uy-rrodriguez commented 9 months ago

Meeting 15th Dec with Ruaraidh Campbell, clinician and researcher at UCLH.

Requirements and Opinions

Main thoughts about the platform:

UCLH, as a Research Hospital, needs to be transparent. This platform would help.

Has never heard of UCL Research Data Repository. Has never heard of Find a Study. Compared to this platform, Find a Study only has trials where people opt in, whereas this platform would publish all research.

Lay person summary

Thinks Lay Summary is very important for this platform. Has not written Lay Summaries in the past but thinks it would be easy and wouldn't take him much time. Mentioned that tools like ChatGPT can assist.

Open access data/code

Happy to publish final code and results. Acknowledges there is a movement towards publishing code as they go, but is not confident yet.

Happy to publish anonymised data if allowed by data provider. Does not know how to make synthetic data, would need support, but would be keen. Wonders about options when data is provided in various formats (CSV, database, others).

Technical

Quarto: Used

GitHub: Very competent, used many options. Thinks it's helpful to practice with a real project rather than tutorials.

GitHub

Can you explain the difference between Git and GitHub?

Have you ever used Git on your own machine?

Can you create, switch and merge branches?

Do you have a GitHub account?

Have you ever commented on an issue or pull request?

Have you ever created a pull request?

Do you know the difference between a public and private repository?

Markdown

Have you heard of markdown?

Do you feel comfortable writing a doc?

Coding

Preferred languages?

Have you ever used Jekyll?

Do you know what a YAML file is?

Runthrough

Given his experience, the run-through was done by reading the instructions and highlighting anything unclear or commenting on things. It didn't seem necessary to carry out the actions, as they were all very clear to him and he knew exactly how to do them.

Tabs

Likes the flexibility provided, even though custom tabs are more appropriate for skilled publishers. Advises having a preset of tabs that people can delete.

uy-rrodriguez commented 9 months ago

Meeting 15th Dec with Matthew Wilson, clinical student and researcher at UCLH.

Requirements and Opinions

Expects major bodies (NIHR) to have similar patient-facing websites to show research, results and funding goals. Would read results only if interested in the content of the research.

Expected functionalities: Search by topic and methodologies used to find previous similar work. See how people used the code to achieve their goals.

Has seen but never used UCL Research Data Repository. Has a project published in Find a Study.

Lay person summary

Has written lay summaries as mandated during the project setup stage (e.g. NIHR lay summary). Confident writing a lay summary at that stage. A lay description of results is not common and would require more effort to publish findings for the general public, but he agrees it would be interesting to have them. It would be good to include the creation of lay content (summary and results) as part of the normal research process.

Open access data/code

Happy to publish final code and results. Has used MIMIC as an open-source dataset. Has concerns about timing, i.e. when to submit projects to the platform, although it's not a big issue as research is also published as preprints.

Data: As his work revolves around granular patient data, he would need support with the steps to prepare data for safe release. Synthetic data would be useful, but he has no previous experience with it.

Technical

Quarto: Used

GitHub: Medium level of comfort using it. Only used private repositories. Likes the peer review process.

GitHub

Can you explain the difference between Git and GitHub?

Have you ever used Git on your own machine?

Can you create, switch and merge branches?

Do you have a GitHub account?

Have you ever commented on an issue or pull request?

Have you ever created a pull request?

Do you know the difference between a public and private repository?

Markdown

Have you heard of markdown?

Do you feel comfortable writing a doc?

Coding

Preferred languages?

Have you ever used Jekyll?

Do you know what a YAML file is?

Runthrough

Will test and send feedback.

uy-rrodriguez commented 8 months ago

Summary

Feedback from users

We have a full spectrum of potential users, with some very comfortable uploading projects using the terminal (clone & push), and others for whom this is all new.

The initial approach of defining custom tabs via a YAML file seemed fine for some of the more experienced users. The option to add or remove tabs as they wish offers welcome flexibility, although some users mentioned it would be good to have a few "standard" tabs, so users visiting the website know what to expect.

In terms of writing Lay Summaries, some users have done it while others don't have any experience at all. There is some worry about publishing reduced or simplified versions of their methodologies and being "criticised". The use of tools like ChatGPT was mentioned as something that could be recommended to less experienced users.

A similar feeling was shared about sharing code in progress.

Conclusions

  1. To cover users with low technical skills, the guidelines to upload projects will use the GitHub platform and contain screenshots of every step.
  2. For tabs, given it's possible to infer the crucial front matter from the presence of files in the folder, the definition of tabs is now inferred from all HTML and Markdown files in the project folder.
  3. Some solutions involve adding suggestions to the guidance for uploading projects, such as:
     a. Standard tabs for full details (not lay) or links to external, more complete code or descriptions.
     b. Mention tools that can help with lay summaries, like ChatGPT, or highlight existing examples.
     c. Describe that projects can be uploaded while in progress and encourage authors to update them later on.
  4. Projects could have a "status" field to mark them as "In progress", "Completed", etc.
  5. Work on GitHub Actions that raise alerts if certain elements (e.g. mandatory tabs) are not present. These can be raised as errors that block merging the PR, as warnings, or as comments in the Pull Request discussion.
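
Conclusions 2 and 5 can be illustrated together with a minimal Python sketch: infer tabs from the HTML and Markdown files in a project folder, then report any mandatory tab that is missing. This is not the site's actual implementation. The function names, the tab names in `MANDATORY_TABS`, and the rule that a tab is named after its file stem are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical mandatory tabs; the real site may require different ones.
MANDATORY_TABS = {"index", "lay_summary"}
# File types that define a tab, per conclusion 2.
TAB_SUFFIXES = {".html", ".md"}


def infer_tabs(project_dir):
    """Return sorted tab names inferred from HTML/Markdown files in project_dir."""
    folder = Path(project_dir)
    return sorted(
        {
            p.stem.lower()  # assumption: tab name is the lower-cased file stem
            for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() in TAB_SUFFIXES
        }
    )


def missing_tabs(project_dir, mandatory=MANDATORY_TABS):
    """Return the sorted mandatory tab names that have no matching file."""
    return sorted(set(mandatory) - set(infer_tabs(project_dir)))
```

A check run on each Pull Request could call `missing_tabs` on the submitted project folder and exit with a non-zero status when the list is not empty, which would surface as a failed check on the PR. Whether that failure blocks merging, or is downgraded to a warning or comment, is a workflow configuration choice.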

To-do

  1. @tcouch to create a Pull Request template, describing what a peer review would contain.
  2. @tcouch to update menu Contribute > Overview: add details on what people can upload and mention that they should only upload what they feel comfortable with. Encourage them to update projects multiple times.
  3. @tcouch to add project status field.
  4. @uy-rrodriguez to re-create PR with test project after @tcouch confirms the new GH actions are ready.

For the future