pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
I want to use the pySCENIC CLI to generate the AUC matrix and the regulons, and then use R for downstream analysis.
I know how to generate the AUC matrix (just run 'pyscenic aucell'), but I only have a vague idea of how to get the 'tf-target list' (regulons).
From reading the issue https://github.com/aertslab/pySCENIC/issues/126 and the tutorial https://rawcdn.githack.com/aertslab/SCENIC/0a4c96ed8d930edd8868f07428090f9dae264705/inst/doc/importing_pySCENIC.html I gathered that I should generate a 'gmt' file and read it in with R.
I used the following code to generate the regulon gmt file:
In the resulting file, the 1st field is the TF name and the 2nd field is 'tf=XXXX', but the 3rd field is 'score=X.XXXX'. This goes against the gmt file format definition at: https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29 which says that only the 2nd column can be used for comments and annotations.
This causes problems with the R function GSEABase::getGmt(), which cannot read the file in properly.
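A possible workaround, as a minimal sketch only: fold the 'score=...' field into the 2nd (description) column before reading the file in R, so that every field from the 3rd onward is a gene name. This assumes the file is tab-separated and that the extra field is a single column that literally starts with 'score='. The function names and the example line are hypothetical and not part of pySCENIC.

```python
import io


def normalize_gmt_line(line):
    """Move a leading 'score=...' gene field into the description column."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        # Nothing after the description column, so nothing to fix.
        return "\t".join(fields)
    name, description, rest = fields[0], fields[1], fields[2:]
    # Assumption: the annotation is one field that starts with 'score='.
    if rest[0].startswith("score="):
        description = f"{description};{rest[0]}"
        rest = rest[1:]
    return "\t".join([name, description] + rest)


def normalize_gmt(src, dst):
    """Copy a gmt stream from src to dst, normalizing every non-empty line."""
    for line in src:
        if line.strip():
            dst.write(normalize_gmt_line(line) + "\n")


# Made-up example line in the shape described above (not real pySCENIC output).
example = "ATF3\ttf=ATF3\tscore=1.2345\tJUN\tFOS\n"
out = io.StringIO()
normalize_gmt(io.StringIO(example), out)
print(out.getvalue(), end="")
```

For real files, the same function could be called with two handles from open(), for example one opened for reading on the original gmt and one opened for writing on a new file. The score is kept in the description column rather than dropped, so it can still be parsed out later if needed.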
Is this a bug in the gmt file output, or is there something I've missed?
Thanks in advance for any comments.