tidyomics / tidyomicsBlog

the manifesto, workshops and tutorials of the tidy transcriptomics

To do list for the ecosystem #9

Open stemangiola opened 4 years ago

stemangiola commented 4 years ago

Hello @mblue9, I'm listing here the things that need to be done for any of the packages in the tidytranscriptomics ecosystem. Feel free to add points (in general, feel free to do anything; all the repositories are yours as well!), and only do the tasks you feel like doing or have time for. The more the merrier! :)

tidybulk: improve documentation


stemangiola commented 4 years ago

Hello @mblue9, between the repos tidySCE, tidySE and tidyseurat, I remember you made some changes. I was wondering whether whatever change you made in one repo you also replicated in the others.

So many repos I am starting to get very confused :)

mblue9 commented 4 years ago

So many repos I am starting to get very confused :)

Me too :)

Hello @mblue9, between the repos tidySCE, tidySE and tidyseurat, I remember you made some changes. I was wondering whether whatever change you made in one repo you also replicated in the others.

Yes, as far as I know I did. I had a look at the list of commits I made to the repos and that seems to be the case.

stemangiola commented 3 years ago

It would be nice to add YouTube thumbnails to the workshops README, since we have videos for the majority of workshops.

As for how to place a thumbnail, I'm not sure whether there are better ways than this: https://stackoverflow.com/questions/62533346/asciidoc-how-to-embed-youtube-videos-in-github-flavored-asciidoc

  • Bioc2020: https://www.youtube.com/watch?v=5Cgnpwv19Jk
  • BiocEurope2020: coming
  • BiocAsia: maybe we don't have it
  • RPharma: ?
  • ABACBS: ?
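
One common way to get a clickable thumbnail, assuming the README is rendered as Markdown, is to wrap YouTube's still image in a link to the video. A minimal sketch in R that builds such a line follows. The helper name `youtube_thumbnail_md` is hypothetical, and the `img.youtube.com/vi/<id>/0.jpg` pattern is an assumption about YouTube's thumbnail URLs rather than something confirmed in this thread.

```r
# Hypothetical helper: build a Markdown image link for a YouTube video.
# Assumes thumbnails are served at https://img.youtube.com/vi/<id>/0.jpg.
youtube_thumbnail_md <- function(video_id, title) {
  sprintf(
    "[![%s](https://img.youtube.com/vi/%s/0.jpg)](https://www.youtube.com/watch?v=%s)",
    title, video_id, video_id
  )
}

# Video id taken from the Bioc2020 link above; the title is a placeholder.
cat(youtube_thumbnail_md("5Cgnpwv19Jk", "Bioc2020 workshop"), "\n")
```

The printed line could be pasted into the README next to the matching workshop entry.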

stemangiola commented 3 years ago

Hello @mblue9,

I thought of a few things that could be a priority now that tidytranscriptomics is becoming established (maybe there are others that you can add). Let me know if you want/have time to take care of some parts of this.

mblue9 commented 3 years ago

Happy to help here as I get time.

Build a root service repository where we define all generics (methods) shared across our tidy repositories, so we can load them all without them overwriting each other

Sounds good. If you could make a list/checklist of which methods are shared that would need to be included, when you've time, that could be a good start, as I'm not sure which ones are shared.
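
To make the idea concrete, here is a minimal base-R sketch of one generic defined in a single place, with each package contributing only methods for its own class. All names (`join_features_sketch`, the `sketch_se` and `sketch_seurat` classes) are hypothetical stand-ins, and S3 is used for brevity; the real packages may need S4 generics instead.

```r
# Would live in the shared "root" package: the generic is defined exactly once.
join_features_sketch <- function(.data, ...) {
  UseMethod("join_features_sketch")
}

# Would live in package A: only a method, no new generic, so nothing is masked.
join_features_sketch.sketch_se <- function(.data, ...) {
  "method for sketch_se"
}

# Would live in package B: same generic, different class.
join_features_sketch.sketch_seurat <- function(.data, ...) {
  "method for sketch_seurat"
}

# Toy objects standing in for the real containers.
se  <- structure(list(), class = "sketch_se")
seu <- structure(list(), class = "sketch_seurat")

join_features_sketch(se)
join_features_sketch(seu)
```

Because both packages would import the generic rather than redefine it, attaching them in either order should not trigger a masking message for that function.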

Improve documentation for all tidytranscriptomics

Happy to help with that. Do you know of ones that definitely need improvement? As maybe could make a checklist to know where to start there?

I don't know very well how to clean up the help pages when the same method is defined for different classes. If you call ? on our methods, the printout is very ugly

I'm afraid that is beyond my current knowledge, you're much more of a package developer expert than me :) Do you know of any packages that have to also handle this type of scenario (plyranges?) that we could learn from?

In the documentation I exposed some of the internal code behind our methods (e.g. differential abundance), to communicate to users what we are doing in the backend. But the majority of methods are missing this transparency

Just to check do other packages do this? And do you think it's quite important? As I've wondered if doing this would be hard to maintain (could be easy to forget to update the code in the documentation to reflect changes in the function). And also could get quite long with different code for different options e.g. edgeR methods, DESeq2, voom with/without treat in test_differential_abundance. I might be wrong but my feeling is anyone who's interested in the code would check the source and we could just make sure to describe which underlying methods/parameters of relevance are used in the @param part (e.g. treat being used in test_above_log2_foldchange)?

(maybe less of a priority) Prettify our messages with a nice library I found https://github.com/r-lib/cli, that is used in tidyverse messages and other modern packages

Looks nice!
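
For a flavour of what this could look like, here is a small sketch that uses `cli::cli_alert_info()` and falls back to base `message()` when cli is not installed. The wrapper name `tidy_inform` is hypothetical and not part of any of the packages.

```r
# Hypothetical wrapper: use cli styling when available, base message() otherwise.
tidy_inform <- function(text) {
  if (requireNamespace("cli", quietly = TRUE)) {
    # "{text}" is interpolated by cli from this function's environment
    cli::cli_alert_info("{text}")
  } else {
    message(text)
  }
  # Return the text invisibly so the call can be used in a pipeline or test
  invisible(text)
}

tidy_inform("tidybulk says: the data have been scaled")
```

Keeping the styling behind one small wrapper would let the packages switch message style in a single place.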

stemangiola commented 3 years ago

if you could make a list/checklist of which methods are shared that would need to be included

Sure, will do. FYI, the goal is that when we load the libraries in any order we should not see a message like "library X has function Y from library Z"

Do you know of ones that definitely need improvement? As maybe could make a checklist to know where to start there?

That's the problem :) I never went function by function to see which one needs work. A checklist of methods is definitely the way to go.

Do you know of any packages that have to also handle this type of scenario

Most methods work on different classes. It is a very common problem that I never addressed properly XD. Seurat has all methods working for an assay or an S4 Seurat object. dplyr has methods working for tbl, grouped tbl, etc. I think it is just a little detail we are missing.
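
One pattern that may help with the help-page printout, assuming the packages are documented with roxygen2, is to document the generic once and attach every class-specific method to the same help topic with `@rdname`. The function names below are hypothetical; this is a sketch of the pattern, not a description of how the packages are currently documented.

```r
#' Scale abundance (sketch)
#'
#' @param .data An object of a supported class.
#' @param ... Passed on to methods.
#' @return A short description of what was scaled.
#' @rdname scale_sketch
#' @export
scale_sketch <- function(.data, ...) UseMethod("scale_sketch")

#' @rdname scale_sketch
#' @export
scale_sketch.data.frame <- function(.data, ...) {
  paste("scaled a data frame with", nrow(.data), "rows")
}

#' @rdname scale_sketch
#' @export
scale_sketch.default <- function(.data, ...) {
  # Fail loudly for classes without a dedicated method
  stop("unsupported class: ", class(.data)[1])
}

scale_sketch(data.frame(x = 1:3))
```

With this layout, all methods share one help topic, so `?scale_sketch` would show a single page instead of one page per class.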

Just to check do other packages do this? And do you think it's quite important? As I've wondered if doing this would be hard to maintain (could be easy to forget to update the code in the documentation to reflect changes in the function).

Good point. So yes, judging by all the feedback I receive, it is important. Everyone, including reviewers and workshop audiences, is (maybe not rightfully) scared of being detached from the analyses (although Seurat has the same problem, but we are dealing with 15-year-old workflows). So everyone says it is fair enough if we document the backend processes well. I don't think anyone would accept looking at the source code as a solution.

And also could get quite long with different code for different options e.g. edgeR methods, DESeq2, voom with/without treat in test_differential_abundance. I might be wrong but my feeling is anyone who's interested in the code would check the source and we could just make sure to describe which underlying methods/parameters of relevance are used in the @param part (e.g. treat being used in test_above_log2_foldchange)?

Yes, the point here is that we don't have to reach 100% perfection. We just have to add enough information to give a good glimpse, through key examples, of how the main workflows are composed; the different flavours will then be easy for the user to imagine. Differential transcript abundance is the most complex one and I would say it is 80% done. The others, e.g. deconvolution and testing, have far fewer branches in the backend.
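
As an illustration of documenting one key path without reproducing every branch, a roxygen2 sketch follows. The function `test_da_sketch` is hypothetical, and the listed edgeR calls are an assumption about a typical quasi-likelihood workflow, not a statement of what `test_differential_abundance` actually runs.

```r
#' Test differential abundance (sketch)
#'
#' @param method One of "edgeR_quasi_likelihood" or "limma_voom".
#'
#' @details
#' For method = "edgeR_quasi_likelihood" the backend is assumed to run,
#' in order: edgeR::DGEList(), edgeR::calcNormFactors(),
#' edgeR::estimateDisp(), edgeR::glmQLFit() and edgeR::glmQLFTest().
#' Other methods follow the same outline with their own fitting calls.
#'
#' @return The name of the selected method (placeholder for real results).
test_da_sketch <- function(method = c("edgeR_quasi_likelihood", "limma_voom")) {
  # match.arg() restricts the input to the documented options
  method <- match.arg(method)
  method
}

test_da_sketch()
```

Describing the sequence of backend calls in `@details`, rather than pasting full code, keeps the text short and leaves less to update when the implementation changes.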

If you know any master's student who would like to get involved, lend a hand and be included in the next tidySE/tidySCE publication, that would be great. I am always trying to expand our great team. So far no success. :(

mblue9 commented 3 years ago

Ok great, thanks for the info!

If you know any master's student who would like to get involved, lend a hand and be included in the next tidySE/tidySCE publication, that would be great. I am always trying to expand our great team. So far no success. :(

I don't at the moment but I'll certainly keep an eye out.

stemangiola commented 3 years ago

Cool, I will soon start organising and getting hands-on.

stemangiola commented 3 years ago
  • Improve documentation for all tidytranscriptomics

I started here. https://github.com/stemangiola/tidybulk/pull/188