About the threshold settings > Edges statistical testing "ON" at the Network inference tab

Dear Dr Cassan and authors,

This is the wonderful software and user interface that I have been looking for. It is a neat suite for differentially expressed gene analysis, yet I have an issue using it. I tried both the local version (macOS 10.14.6, R 4.1.1, RStudio with R 4.1.1, on a MacBook Pro 15-inch, 2018, Core i7, 16 GB memory) and the web version (https://diane.bpmp.inrae.fr/). However, DIANE always crashes at 'Edges statistical testing "ON"'. It worked well with hard thresholding and the network analysis that follows. I would like to test step by step in R, but I could not find the details of the R commands, particularly how DIANE builds the 'list' dataset when it incorporates the count data, gene information, and experimental conditions (and, later, the GO term files). If you can instruct me what might cause the crash at the Edges statistical testing, and how I can test locally by building the list datafile for DIANE, I would much appreciate it.

Thank you! Again, yours is a super wonderful analysis suite.

Best regards, Masato

Dear Dr Yoshizawa,

Thank you very much for your kind feedback about DIANE, and for reaching out with this question. Network thresholding via statistical testing is one of the most specific features of our suite, and user feedback about it is very rare, so I would be glad to get to the bottom of this issue with you.

If you can instruct me what might cause the crash at the Edges statistical testing

To help you as best I can, I will start with some questions:

While using the local Shiny app in RStudio, does a warning or error message appear in the R console at the problematic network thresholding step? If so, what is its content? Does the crash happen right away, or does some computation run first?
What is the number of input genes for network inference, and how many of those genes are regulators? Are the parameters used for network inference and testing the default ones, or did you change their values?
To be sure of finding the issue, the ideal would be for me to reproduce the crash locally. Would it be possible to send me the data (the expression file and regulators file, or only a subset of them) and the main steps and settings you used up to the crash, at oceane.cassan@cnrs.fr? Of course, I would keep this data strictly confidential and delete it as soon as this issue is closed.
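
If it helps with the second question, the two numbers can be obtained with a few lines of base R. This is only a minimal sketch: `input_genes` and `regulators` are hypothetical toy vectors standing in for the genes you send to network inference and the IDs in your regulators file.

```r
# Hypothetical toy inputs (not DIANE objects): replace them with your own
# vector of input gene IDs and the IDs read from your regulators file.
input_genes <- c("geneA", "geneB", "geneC", "geneD", "geneE")
regulators  <- c("geneB", "geneE", "geneZ")

# Number of distinct genes given to network inference
n_genes <- length(unique(input_genes))

# Regulators only count if they are present among the input genes
regulators_in_input <- intersect(regulators, input_genes)
n_regulators <- length(regulators_in_input)

cat("Input genes:", n_genes, "\n")
cat("Regulators among input genes:", n_regulators, "\n")

# R and package versions are also useful in a bug report
print(R.version.string)
if (requireNamespace("DIANE", quietly = TRUE)) {
  print(packageVersion("DIANE"))
}
```

Pasting this output into your reply, together with any console message, would already narrow things down.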

how I can test locally by building the list datafile for DIANE

To use DIANE's functions in a script on your own data, you will find useful information in this vignette: https://oceanecsn.github.io/DIANE/articles/DIANE_Programming_Interface.html

More precisely, the data needed to use DIANE's functions are:

A dataframe of counts
An optional dataframe for gene annotation
An optional dataframe for GO terms/genes matching

To see what those dataframes should look like, you can inspect the counts and annotations of DIANE's demo data as follows:

library(DIANE)
# counts
data("abiotic_stresses")
abiotic_stresses$raw_counts
# annotation
data("gene_annotations")
gene_annotations$`Arabidopsis thaliana`

To create a dataframe from your expression csv or txt file, you can store it in a variable in R with counts <- read.csv(pathToYourExpressionFile, sep = ',', row.names = "Gene", header = TRUE, stringsAsFactors = FALSE, check.names = FALSE) (with the correct path and separator). Then, use your counts instead of abiotic_stresses$raw_counts as an argument to DIANE's functions in the rest of the vignette, for normalisation, differential expression, and so on. The same applies to gene annotations, which can be imported with read.csv and stored in a dataframe with gene IDs as rownames and columns named "label" and/or "description".
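
As a minimal sketch of that import step, the example below writes two toy files to a temporary location and reads them back with the same read.csv arguments. The gene IDs, sample names, labels, and descriptions are all made up for illustration, and the condition_replicate style of the sample names is an assumption based on the demo data, so please check it against the vignette.

```r
# Toy expression file, written to a temp file so the example is self-contained.
# Replace `expr_path` with the path to your own expression file.
expr_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Gene,control_1,control_2,treated_1,treated_2",  # assumed naming pattern
  "geneA,12,15,40,38",
  "geneB,0,3,1,2",
  "geneC,250,231,90,102"
), expr_path)

# Gene IDs become row names; check.names = FALSE keeps sample names untouched
counts <- read.csv(expr_path, sep = ",", row.names = "Gene",
                   header = TRUE, stringsAsFactors = FALSE,
                   check.names = FALSE)

# Toy annotation file with the "label" and "description" columns
annot_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Gene,label,description",
  "geneA,LBL_A,toy description for gene A",
  "geneB,LBL_B,toy description for gene B",
  "geneC,LBL_C,toy description for gene C"
), annot_path)

annotation <- read.csv(annot_path, sep = ",", row.names = "Gene",
                       header = TRUE, stringsAsFactors = FALSE)

# Quick sanity checks before passing the data to any analysis function
stopifnot(all(sapply(counts, is.numeric)))          # counts must be numeric
stopifnot(all(rownames(counts) %in% rownames(annotation)))
print(dim(counts))
print(head(annotation))
```

If the numeric check fails on your own file, the separator is usually the culprit (for example a tab-separated file read with sep = ",").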

The reference documentation lists DIANE's functions and their required arguments in detail.

To conclude, I think it would be easier if I had access to your data, as it would tell me whether there is an internal problem in the thresholding function that only a developer can solve. Indeed, even if you try a step-by-step approach in a script, it will probably crash when calling the function test_edges without being more informative than the crash in the user interface. Also, depending on your proficiency in R, this step-by-step approach could be time-consuming.
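
If you do go the scripted route anyway, wrapping the problematic call in base R's condition handlers would at least record any warning or error message so that you can send it to me. The sketch below uses a stand-in function, `failing_step`, which is purely hypothetical and not part of DIANE; you would replace it with your own call to test_edges, using the arguments given in the reference.

```r
# Stand-in for the step that fails. This is NOT a DIANE function: replace
# the call to failing_step() below with your own call to test_edges.
failing_step <- function() {
  warning("toy warning before the failure")
  stop("toy error raised by the stand-in step")
}

# Container for whatever messages are emitted
captured <- list(warnings = character(0), error = NULL)

result <- tryCatch(
  withCallingHandlers(
    failing_step(),
    warning = function(w) {
      # Record the warning text, then let execution continue
      captured$warnings <<- c(captured$warnings, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  ),
  error = function(e) {
    # Record the error text and return NULL instead of stopping the script
    captured$error <<- conditionMessage(e)
    NULL
  }
)

print(captured)
```

Note that this only catches R-level warnings and errors. If the whole R session aborts (for example when running out of memory), nothing will be recorded, and that fact alone would be useful to know.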

I hope this helps, Best regards,

Océane
