Closed tcouch closed 8 months ago
A semi-structured approach to trialling the UCLH research discovery site with researchers.
(to be read out to the researchers)
Thank you for agreeing to take part in trialling the site and giving us your feedback. First we’ll briefly describe the purpose of the site.
UCLH research discovery is a new site developed as part of a collaboration between UCLH CRIU and UCL ARC with the aim of helping the public (and other researchers) to better understand the research being conducted at UCLH, and how patient data is being used in the course of that research. Researchers will be able to add their own content to describe their work including details such as how patient data was used, how it was kept safe, the analysis that was carried out, and the outcomes that were produced. By publishing synthetic datasets and code in UCL’s research data repository, you can even show people how to repeat the steps you took to carry out the research, and perhaps help other researchers to repeat your approach with other datasets.
Contributions to the site are made through GitHub, a popular online platform for collaborating on software. While some experience would be beneficial, we have written guidance describing how to submit content and how the peer review process works.
Today, we’d like to trial this approach with you and find out what we need to do to improve the concept and the support provided.
To gather feedback on the overall concept of the site and the type of content it should contain. Again, the questions are supposed to be open-ended, but the checklist items can be used to prompt for their opinion about specific aspects.
Question: What sort of content would you expect to see on a site that aims to share information about research projects with the public?
Checklist:
Question: Do you have any concerns about sharing information about your project?
Checklist:
To assess researchers’ experience with the tools they’d need to use in order to contribute to the site. The questions are supposed to be open-ended, and the checklist is there as a series of additional prompts we might use to make sure we cover all the areas.
Question: Tell me about your level of familiarity and experience with GitHub.
Checklist:
Question: Have you created a document with markdown before? If so, can you briefly describe what you’ve done and any tools or packages you’ve used alongside it?
Checklist:
Question: Do you have any coding experience? If so, can you briefly describe the kind of things you’ve used it for?
Checklist:
Either in person or sharing their screen, researchers follow the guidance on submitting a project. Explain that the focus is not on the quality of the content so just do the minimum. We give prompts if they get stuck, but otherwise just note down things like:
Once they make a pull request, carry out the review including:
Note how they respond and how they resolve these requests.
Meeting done Dec 6 with Hrishee
He did data science at UCLH and also does clinical work, but this has expired for the moment.
Checklist:
features are generated from a lot of other tables to result in ~200 features which will output to 1
Checklist:
Git and GitHub
Checklist:
Markdown
Checklist:
General
Checklist:
Part 3 - Contribution Walkthrough
Either in person or sharing their screen, researchers follow the guidance on submitting a project. Explain that the focus is not on the quality of the content so just do the minimum. We give prompts if they get stuck, but otherwise just note down things like:
We might need to make it clear that the UCL login sometimes doesn't work: it sent him into a loop and then just denied him access. The data repo said they don't have an account for him and that he should contact the institution, so people may need to sign up if UCL hasn't already created an account for them.
We should consider including some guidance for running the cloned repo locally so they can see their project and check it's to their satisfaction before opening the PR. This may also help with identifying e.g. broken links. It could be suggested just for those who are comfortable, and hopefully the reviewers will catch the other bits?
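As one possible form this local check could take, here is a minimal Python sketch that scans a cloned repo for relative links pointing at files that don't exist. It is an illustration only and not part of the site's current tooling: the function name `find_broken_relative_links` and the assumption that project pages are `.md` or `.qmd` files are hypothetical, and it only looks at inline `[text](target)` links.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target) or ![alt](target "title")
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")
# Matches targets that start with a URL scheme such as https: or mailto:
SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_relative_links(root, suffixes=(".md", ".qmd")):
    """Return (file, target) pairs for relative links whose target is missing.

    The file suffixes are an assumption about how project pages are stored.
    """
    root = Path(root)
    broken = []
    for page in sorted(root.rglob("*")):
        if not page.is_file() or page.suffix not in suffixes:
            continue
        text = page.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            # Skip external URLs, mail links and in-page anchors.
            if SCHEME_PATTERN.match(target) or target.startswith("#"):
                continue
            # Drop any #fragment or ?query before checking the path.
            path_part = re.split(r"[#?]", target, maxsplit=1)[0]
            if not path_part:
                continue
            if path_part.startswith("/"):
                # Treat a leading slash as relative to the repo root.
                candidate = root / path_part.lstrip("/")
            else:
                candidate = page.parent / path_part
            if not candidate.exists():
                broken.append((page.relative_to(root).as_posix(), target))
    return broken
```

A contributor comfortable with the terminal could run something like this from the root of their clone before opening the PR; it does not check external URLs, so reviewers would still need to look at those.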
Could also take some inspiration from https://uclhal.org/tutorials/onboarding/onboarding-1.html regarding the style of instruction?
The following wasn't done due to errors on our part but he's happy to go ahead with it in future. Once they make a pull request, carry out the review including:
Note how they respond and how they resolve these requests.
Main takeaway: they'd like some help/guidance on how to synthesise their data, and if we could encourage people to use the reforms then we could adhere to a growing international standard.
Meeting 6th Dec with Elizabeth Brown and Samiran Ray
Elizabeth: It's a good way of showing patient involvement which is a good line that you need to put into grant applications. In terms of who the audience is, I think what you would expect to include would be quite different. Would there be a way to switch between professional and lay person views?
So on the one hand you have the average person who is a patient at UCLH and is worried about their data and wants to see what's being done with it. And then you have a researcher or someone who's interested in a particular area who wants to have a look at the exact code you're using and how you're treating certain variables.
Samiran: I've not written one personally - we have stuff which gets a public press release so there are strap lines, that kind of thing. Then there are public facing databases like clinical-trials.gov where you have to register your project when you start it and provide details of the output which goes into a lot of detail and you have to answer a lot of questions, providing numbers etc.
Samiran: When you're writing lay summaries you tend to gloss over a lot of details, but you have to do that for certain professional audiences too to some degree. So it's more than just two audiences.
EB: If someone was asking me to put code on a public website, I would feel worried that people would start nitpicking. You don't want to simplify things because you don't want people to come back and say "you haven't done this" or "this is wrong", and so almost out of a defensive instinct you end up putting in way too much detail because you feel exposed. I can't really imagine how I'd take a small part of what I've done (code) to demonstrate - you'd almost have to put the whole thing up, and then it would feel like "have a look at my methods and tell me if you think they're right".
SR: So, how would this be any different to publishing my code on GitHub?
EB/SR: We don't make our code available on GitHub
SR: My coding is nowhere near in terms of finesse...
TC: Is there anything that could convince you to change your mind?
SR: Somebody having a look at my code first
EB: And only at the end of your project (which is a long process) because you wouldn't want people taking advantage
Quarto: Both never used
Can you explain the difference between Git and GitHub?
Have you ever used Git on your own machine?
Can you create, switch and merge branches?
Do you have a GitHub account?
Have you ever commented on an issue or pull request?
Have you ever created a pull request?
Do you know the difference between a public and private repository?
Have you heard of markdown?
Do you feel comfortable writing a doc?
Preferred languages?
Have you ever used Jekyll?
Do you know what a YAML file is?
EB:
SR:
Meeting 15th Dec with Ruaraidh Campbell, clinician and researcher at UCLH.
Main thoughts about the platform:
UCLH, as a Research Hospital, needs to be transparent. This platform would help.
Has never heard of UCL Research Data Repository. Has never heard of Find a Study. Compared with this platform, Find a Study only has trials where people opt in, whereas this platform would publish all research.
Thinks Lay Summary is very important for this platform. Has not written Lay Summaries in the past but thinks it would be easy and wouldn't take him much time. Mentioned that tools like ChatGPT can assist.
Happy to publish final code and results. Acknowledges there is a movement towards publishing code as they go, but is not confident yet.
Happy to publish anonymised data if allowed by data provider. Does not know how to make synthetic data, would need support, but would be keen. Wonders about options when data is provided in various formats (CSV, database, others).
Quarto: Used. GitHub: Very competent, used many options. Thinks it's helpful to practice with a real project rather than tutorials.
Can you explain the difference between Git and GitHub?
Have you ever used Git on your own machine?
Can you create, switch and merge branches?
Do you have a GitHub account?
Have you ever commented on an issue or pull request?
Have you ever created a pull request?
Do you know the difference between a public and private repository?
Have you heard of markdown?
Do you feel comfortable writing a doc?
Preferred languages?
Have you ever used Jekyll?
Do you know what a YAML file is?
Given his experience, the run-through was done by reading the instructions and highlighting anything unclear or commenting on things. It didn't seem necessary to carry out the actions, as they were all very clear to him and he knew exactly how to do them.
Likes the flexibility provided, even though it is more appropriate for skilled publishers. Advises having a preset that people can delete.
Meeting 15th Dec with Matthew Wilson, clinical student and researcher at UCLH.
Expects major bodies (NIHR) to have similar patient facing websites to show research and results, funding goals. Would read results only if interested in the content of the research.
Expected functionalities: Search by topic and methodologies used to find previous similar work. See how people used the code to achieve their goals.
Has seen but never used UCL Research Data Repository. Has project published in Find a Study.
Lay summaries are done to the extent mandated at the project setup stage (e.g. NIHR lay summary). Confident writing a lay summary at that stage. A lay description of results is not common and would require more effort to publish findings for the general public, but agrees it would be interesting to have them. It would be good to include the creation of lay content (summary and results) as part of the normal research process.
Happy to publish final code and results. Has used MIMIC as an open-source dataset. Has concerns about timing (when to submit projects to the platform), although it's not a big issue as research is also published as preprints.
Data: As his work revolves around granular patient data, he would need support with the steps to prepare data for safe release. Synthetic data would be useful, but he has no previous experience with it.
Quarto: Used. GitHub: Medium level of comfort using it. Only used private repositories. Likes the peer review process.
Can you explain the difference between Git and GitHub?
Have you ever used Git on your own machine?
Can you create, switch and merge branches?
Do you have a GitHub account?
Have you ever commented on an issue or pull request?
Have you ever created a pull request?
Do you know the difference between a public and private repository?
Have you heard of markdown?
Do you feel comfortable writing a doc?
Preferred languages?
Have you ever used Jekyll?
Do you know what a YAML file is?
Will test and send feedback.
We have a full spectrum of potential users, with some very comfortable uploading projects using the terminal (clone & push), and others for whom this is all new.
The initial approach of defining custom tabs via a YAML file seemed fine for some of the more experienced users. The option to add or remove tabs as they wish is a nice flexibility, although some users mentioned it would be good to have a few "standard" tabs, so users visiting the website know what to expect.
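To make the "standard tabs plus flexibility" idea concrete, here is a minimal Python sketch of how a preset could be combined with a researcher's own choices. Everything in it is hypothetical: the tab names in `STANDARD_TABS` are only illustrative (loosely based on the content areas described in the introduction), and `resolve_tabs` is not an existing function of the site. It works on plain lists, as might be read from a project's YAML file.

```python
# Hypothetical preset: these tab names are illustrative, not the site's actual configuration.
STANDARD_TABS = ["Lay summary", "Data use", "Data safety", "Analysis", "Outcomes"]


def resolve_tabs(custom_tabs=None, removed_tabs=None):
    """Start from the standard preset, drop removed tabs, then append custom ones.

    Names are compared case-insensitively so "analysis" does not duplicate "Analysis".
    """
    removed = {name.casefold() for name in (removed_tabs or [])}
    tabs = [name for name in STANDARD_TABS if name.casefold() not in removed]
    seen = {name.casefold() for name in tabs}
    for name in custom_tabs or []:
        if name.casefold() not in seen:
            tabs.append(name)
            seen.add(name.casefold())
    return tabs


# A project that deletes one preset tab and adds its own "Code" tab.
project_tabs = resolve_tabs(custom_tabs=["Code", "analysis"], removed_tabs=["Data safety"])
```

With this approach a project that specifies nothing gets the full preset, which matches the suggestion of a preset that people can delete from, while experienced contributors keep the freedom to add or remove tabs.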
In terms of writing Lay Summaries, some users have done it while others don't have any experience at all. There is some worry about publishing reduced or simplified versions of their methodologies and being "criticised". The use of tools like ChatGPT was mentioned as something that could be recommended to less experienced users.
A similar feeling was shared about sharing code in progress.
Carry out an interview and a run-through of the project submission process with researchers to get their feedback on the first draft of the site.