gitter-lab / SINGE

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data
MIT License

Is it possible to have a more detailed intro for the inputs in 'mat' format? #72

Closed by christear 8 months ago

christear commented 8 months ago

Thanks very much for developing this useful tool. I would like to use it to construct a GRN from my own data. Since I'm not familiar with MATLAB, I tried to run it with a bash script. However, I did not find a detailed description of the input files in 'mat' format. I know each one should be a saved MATLAB workspace file containing several variables, so I inspected the example input 'mat' files myself. I found two variables in 'X_SCODE_data.mat', 'X' and 'ptime', and one variable in 'gene_list.mat', 'gene_list'. I also found that 'X' in 'X_SCODE_data.mat' is a sparse matrix, 'ptime' is a matrix, and 'gene_list' is a cell. I created a 'mat' file with the same organization from my own data but got this warning: "Unable to read some of the variables due to unknown MAT-file error." I'm not sure whether there is more hidden information for these variables in the 'mat' format. Is it possible to directly take more common file types, such as 'txt/csv', as inputs?

agitter commented 8 months ago

Hi @christear, sorry to hear that the expected input format is not clear. Currently we only have this partially described in the readme:

  • data - Path to matfile with ordered single-cell expression data (sparse matrix X), pseudotime values (array ptime), optional indices of regulators (array of index values regix), and optional branching information (matrix branches). For example, the data in data1/X_SCODE_data.mat represents a linear trajectory, and data_bifurcated/X_data_bifurcated.mat represents a branching trajectory with two branches.
  • gene_list - Path to file containing list of gene names corresponding to the rows in the expression data matrix X in Data (e.g., data1/gene_list.mat)

Was that similar to the .mat file you created? It sounds like you have the expected data structures in place.

What version of MATLAB are you using? I'm curious if the error could have to do with the MAT file format instead of the contents.
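One way to check the file format question without MATLAB is to look at the descriptive text at the start of the file. The sketch below is an illustration, not part of SINGE. It relies on MAT-files in the version 5 family (v5/v6/v7) starting with the text "MATLAB 5.0 MAT-file" and v7.3 (HDF5-based) files starting with "MATLAB 7.3 MAT-file". The function name `mat_file_format` and the demo file are hypothetical, and nothing in this thread confirms that the format was the cause of the warning.

```python
import os
import tempfile


def mat_file_format(path):
    """Guess the MAT-file flavour from the text at the start of the file."""
    with open(path, "rb") as handle:
        header = handle.read(116)  # descriptive text field of the header
    text = header.decode("ascii", errors="replace")
    if text.startswith("MATLAB 7.3 MAT-file"):
        return "v7.3 (HDF5-based)"
    if text.startswith("MATLAB 5.0 MAT-file"):
        return "v5 family (v5/v6/v7)"
    return "unknown (possibly v4 or not a MAT-file)"


# Demo on a stand-in file that contains only a fake header, NOT a valid
# MAT-file; replace demo_path with the path to your own .mat file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.mat")
with open(demo_path, "wb") as handle:
    handle.write(b"MATLAB 5.0 MAT-file, Platform: demo".ljust(116) + bytes(12))

print(mat_file_format(demo_path))
```

If the example files shipped with SINGE and your own file report different flavours, that difference would be the first thing to rule out.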

SINGE currently cannot take .txt or .csv files as input, but we realize that would improve usability. No one is actively developing the software right now. If we resume development, that modification would be the first thing to add.
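Until then, a possible workaround is to build the .mat files outside MATLAB. The following is a minimal sketch using NumPy and SciPy (`scipy.io.savemat`), not something SINGE provides. The toy values, gene names, and output file names are hypothetical. It assumes that `X` is genes × cells (the readme says gene names correspond to rows of `X`), that "ordered" means cells sorted by pseudotime, and that the default row orientation `savemat` uses for 1-D arrays is acceptable. Compare the result against data1/X_SCODE_data.mat before relying on it.

```python
import os
import tempfile

import numpy as np
from scipy import sparse
from scipy.io import loadmat, savemat

# Hypothetical toy data: 3 genes (rows) x 4 cells (columns).
expression = np.array([
    [0.0, 1.2, 0.0, 3.1],
    [2.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 4.0, 1.0],
])
pseudotime = np.array([0.3, 0.0, 0.9, 0.5])
gene_names = ["geneA", "geneB", "geneC"]

# Assumption: "ordered" means cells (columns) sorted by pseudotime.
order = np.argsort(pseudotime)
X = sparse.csc_matrix(expression[:, order])  # written as a MATLAB sparse matrix
ptime = pseudotime[order]

# A NumPy object array is written as a MATLAB cell array.
gene_list = np.array(gene_names, dtype=object)

out_dir = tempfile.mkdtemp()
data_path = os.path.join(out_dir, "X_mydata.mat")
genes_path = os.path.join(out_dir, "gene_list.mat")
savemat(data_path, {"X": X, "ptime": ptime})
savemat(genes_path, {"gene_list": gene_list})

# Read the file back to confirm the variables survived the round trip.
check = loadmat(data_path)
print(sparse.issparse(check["X"]), check["X"].shape, check["ptime"].shape)
```

With real data, `expression` and `pseudotime` could be loaded from txt/csv files with `numpy.loadtxt` in place of the toy arrays.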

christear commented 8 months ago

Thank you very much. I eventually found the problem and ran SINGE successfully with my data.

agitter commented 8 months ago

I'm happy to hear it's working for you. Please let us know if you have other problems or have suggested changes for the documentation.