UKDRI / ECR-Informatics-Committee

A repo for the UK DRI Early Career Researchers Informatics Committee, containing useful information such as the committee charter and kanban boards for our projects.
https://ukdri.github.io/ECR-Informatics-Committee/

questions/activity related to reproducibility matters for connectome booth #5

Closed CAmaat closed 9 months ago

H-Mateus commented 1 year ago

I'm thinking it makes sense to have a scale for responses based on their familiarity:

How familiar are you with the following:
Q1. Version control (e.g. Git/GitHub/GitLab)
Q2. Unit tests for software development
Q3. Unit tests for data integrity/analysis consistency (e.g. testdat in R or TDDA in Python)
Q4. Containerisation (e.g. Docker/Singularity)
Q5. Virtual environments (e.g. conda/renv)
Q6. Workflow managers (e.g. Nextflow/WDL/Snakemake)
Q7. Continuous integration/continuous delivery (CI/CD)

H-Mateus commented 1 year ago

Could also be good to have a free text field for people to add anything else they think is relevant/missing

I'm wondering if it'd be good to gauge how interested people are in learning more about these topics too 🤔 Maybe if there's strong interest in particular areas, we could try and put on some training events down the line

In terms of actually getting answers, I suppose the simplest thing would be to have a QR code to a Google form and/or a laptop/tablet with it loaded at the table for people to fill in on the day

CharlotteAnne commented 1 year ago

I guess I'm curious what the aim of the activity will be: a) to get people thinking about reproducibility b) to gather info for future events (e.g. gauge interest) c)....?

H-Mateus commented 1 year ago

I think it'll depend a bit on where people are at, but if we can gauge that most people are at least aware of these areas, and a good number are interested in learning how to incorporate them into their research, then I reckon the committee is well-placed to try and support and encourage that development (especially with the help of some higher ups)

On the other hand, if almost none of them know anything about them, and they don't care, then I guess we know there's an uphill struggle... And I can know to go and cry in a corner 😅

I'm no expert software developer myself, but I get the impression there are a lot of good practices that aren't typically done in research that we could learn from (maybe we could even go looking for some pure software dev types/companies to offer some training 🤷)

Either way, I guess it's hard to know where we stand without gauging what people already know and what they'd be interested in learning, or at least that was my thinking

CAmaat commented 1 year ago

hmmm I actually almost feel like this could have been a question on the interest form... Is there still a way to get it on there? Then on the poster/booth we only have one QR code that is linked to the interest form that will be sent out over email. I think a lot of people do not read their email, so it'd be good to get people to fill out the form on the spot

H-Mateus commented 1 year ago

Sure, if it hasn't been sent out yet, I don't see why it couldn't be added 👍

And yeah, I agree that having a way to prompt people to fill in the form at the booth will be important for getting more responses. Maybe one of us could have a laptop with the form open at the table if people don't want to use the QR code for whatever reason 🤔

CAmaat commented 1 year ago

In order to gauge what people are currently focused on, we could also have a big blank sheet on which we build up a mind map throughout connectome. Key words could be: data analysis, software development, hardware? then people can jot down unique words that come to mind...

I'd be curious to see what people come up with! Whatever they do not come up with should probably become a focus area?

CharlotteAnne commented 1 year ago

I think Steve has probs already mailed about the survey and there's already a lot in there - there is a question about what areas you'd like to develop and one of them is reproducibility. How about you have these things up on a board, with emoji stickers and pens, so we can get some kind of fun sentiment analysis going on? Maybe some of it people are really into and some of it people consider a waste of time? I'm very much in favour of all of it and use it regularly, but as a postdoc realising that there really is very little reward for putting the time into it, others may feel this way too...

H-Mateus commented 1 year ago

I worry with a mind map that people might just go blank/not think of certain things on the spot, but I like the idea of a sentiment analysis! If we leave some space for people to write down additional things they think of, and maybe also ask about reasons people aren't using these things (time, training, a PI who cares even a bit about their science not being crap, etc), that could give a pretty good overview of where people are at with this stuff 🤔

It also occurred to me to maybe add a question on literate programming (e.g. R Markdown/Quarto/Jupyter notebooks)

I agree on the lack of reward for engaging in good practice generally, but I get the sense things are moving in a positive direction (pushes for open access publishing + data sharing and such from funders, Plan S, etc), admittedly at an infuriatingly slow pace though. I think it's often appreciated by PIs with informatic knowledge/PIs that aren't general dumdums as well, so one could kinda treat it like a litmus test (i.e. would you really want to work at a lab that isn't going to value your skills and effort in this domain, to say nothing of harbouring some vain hope they might actually support you in continuing to do it)

I also personally think that part of the selfish argument for doing these things is that informaticians can build up a public portfolio evidencing their skills in a way that is more likely to be appreciated by industry/competent employers, a lot more so than dusty papers where we're often relegated to middle authors anyway 🤷 That way, one has somewhere to go if they fully despair at the state of academia and/or can't find a competent PI to have them

Not that I've spent far too much time wallowing in despair myself, no ma'am, I'm definitely a well-adjusted happy bundle of optimism 🙃

CAmaat commented 1 year ago

I was just thinking... Is it a weird idea to combine this with asking people about what they would like to see at a DRI Hackathon? We could have a sheet with pens, emoji stickers etc and we can steer a bit by already writing down "setting up DRI reproducibility workflow". This doesn't entirely capture it, but the only thing we have not asked the community yet is about ideas for problems to solve at a hackathon. Any ideas will likely reflect the defined, complex problems that people are currently working on and whatever is on the board could be a conversation starter (i.e. when you talk with someone, ask them about what their experience is with making their work reproducible). What do you think?

H-Mateus commented 1 year ago

100% agree! Perhaps we could brainstorm some hackathon projects, or perhaps broad domains of focus ourselves so they have a starting point, and ask folk for their thoughts/anything they'd want to add 🤔

CAmaat commented 1 year ago

Let's do that! I can arrange a big sheet and some markers... as for emoji stickers... we can probably still resort to Amazon? We will all be there on the ECR day anyway I think, so we can all get together during the unofficial ECR poster session, create the starting point and get to know each other a bit in person :)

CAmaat commented 1 year ago

nvm I just see your slack message!!

H-Mateus commented 1 year ago

In the interest of posterity, I'll report that the little survey went pretty well!

We got nearly 30 responses and I had some interesting discussions with lots of folk about it. Subjectively, people seemed keen on learning more about this stuff, especially the aspects they weren't familiar with, so training could be a good option to look into

Also had a lot of excitement for the idea of a code review platform, so I'm playing around with some of the GitHub organisation features to see what could be put together

I've put the data here and added a little report here