TO-GCN step has almost no output

cyycyj commented 2 months ago

Dear Yaoming,

First of all, I would like to express my sincere appreciation for your excellent tool, TO-GCN. It has been really useful for my research.

Today, while I was working with TO-GCN_STAR, something strange happened:

(base) [c@n01 exp]$ Cutoff 5 01.TF.genes.exp.tsv
No. of TFs: 1076
No. of time points: 5

Cutoff value for your reference: 0.96 ~ 1.00
(base) [c@n01 exp]$ cat initial_seed.txt 
C02G002477
(base) [c@n01 exp]$ TO-GCN 5 01.TF.genes.exp.tsv initial_seed.txt 0.96
No. of TFs: 1076
No. of time points: 5
No. of initial TF seeds: 1
Cutoff: 0.96

Assigning levels for nodes in GCN by Breadth-First-Search (BFS) method......
Done!
(base) [c@n01 exp]$ l
total 120K
-rw-r--r-- 1  96K Jun 19 23:49 01.TF.genes.exp.tsv
-rw-r--r-- 1  11 Jun 19 23:50 initial_seed.txt
-rw-r--r-- 1  52 Jun 19 23:52 Node_level.tsv
-rw-r--r-- 1  63 Jun 19 23:52 Node_relation.csv
-rw-r--r-- 1  8.2K Jun 19 23:50 PCC_histogram.tsv
(base) [c@n01 exp]$ cat Node_*
TF_gene_ID      assigned_level
C02G002477      1
C13G001305      1
node_1 ID, node_2 ID, PCC_value
C02G002477,C13G001305,0.990954

Do you have any idea what might cause this? I have attached the 01.TF.genes.exp.tsv file in case it helps.

Best regards,

Andrew

01.TF.genes.exp.txt

cyycyj commented 2 months ago

Let me give an update about the detailed process:

Step 1: Identify the transcription factor (TF) genes and non-TF genes.

Step 2: Perform DEG analysis between 5 stages with DESeq2, using an FDR cut-off of 0.05 and a log2 fold change cut-off of 1.

Step 3: Extract the intersection genes that are both DEG and TF genes for further analysis, and then generate 01.TF.genes.exp.tsv (expression in TPM) based on this intersection. Genes with an average TPM of less than 0.5 are excluded.
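For anyone following along, Step 3 can be sketched roughly as below. This is only an illustration of the filtering described above, not the script I actually ran; the function name `select_tf_degs` and the in-memory dict layout are assumptions made for the example.

```python
from statistics import mean


def select_tf_degs(tpm, tf_ids, deg_ids, min_mean_tpm=0.5):
    """Keep genes that are both TF and DEG and are expressed on average.

    tpm     : dict mapping gene ID -> list of TPM values, one per time point
    tf_ids  : iterable of gene IDs annotated as transcription factors
    deg_ids : iterable of gene IDs called as DEGs (e.g. by DESeq2)

    Genes whose mean TPM is below min_mean_tpm are dropped.
    (Hypothetical helper, named only for this sketch.)
    """
    wanted = set(tf_ids) & set(deg_ids)  # intersection of TF genes and DEGs
    return {
        gene: values
        for gene, values in tpm.items()
        if gene in wanted and mean(values) >= min_mean_tpm
    }
```

Passing the full TF list as both `tf_ids` and `deg_ids` gives the "all TF genes" matrix without the DEG restriction.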

petitmingchang commented 2 months ago

@cyycyj

From the level-assignment output, only one gene connects directly to the seed TF. The network stops there because no other TF genes are connected to these two, so there is nothing more to build a TO-GCN from. I suggest running TO-GCN again with all TF genes, not only the DEGs.

If it still does not work, you may add more TF genes to the seed set.
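Before re-running, it can help to check how many TF genes actually correlate with a candidate seed at the chosen cutoff. The sketch below is not part of TO-GCN. It assumes that an edge means a Pearson correlation coefficient (PCC) at or above the cutoff, as the single row in Node_relation.csv suggests, and the names `pearson` and `seed_partners` are made up for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # a flat profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def seed_partners(expr, seed, cutoff):
    """List (gene, PCC) pairs with PCC >= cutoff against the seed.

    expr maps gene ID -> expression values across time points.
    Assumption: only positive correlations at or above the cutoff count.
    """
    if seed not in expr:
        raise KeyError(f"seed {seed!r} is not in the expression matrix")
    partners = []
    for gene, values in expr.items():
        if gene == seed:
            continue
        r = pearson(expr[seed], values)
        if r >= cutoff:  # NaN compares False, so flat profiles are skipped
            partners.append((gene, r))
    return sorted(partners, key=lambda pair: -pair[1])
```

A seed with an empty or very short partner list at the cutoff is unlikely to grow into a multi-level network, so a different seed or a lower cutoff may be worth trying. The `KeyError` also catches a seed ID that does not match any ID in the matrix exactly.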

petitming

cyycyj commented 2 months ago

Dear Yao-ming,

Thank you for your advice. I ran the TO-GCN pipeline again, this time using the full TF gene expression matrix (01.TF.genes.exp.txt) and all genes from MFSelector_Type1.txt (derived from the MFSelector pipeline) as the initial seeds. However, the problem remains: Node_relation.csv contains nothing except the header, and Node_level.txt contains only the genes from MFSelector_Type1.txt.

I have attached the corresponding files, and I really appreciate your help.

01.TF.genes.exp.txt MFSelector_Type1.txt MFSelector_Type2.txt Node_relation.csv Node_level.txt

petitmingchang commented 2 months ago

@cyycyj

I've tried the TO-GCN pipeline with your data and successfully got a TO-GCN with 8 levels. I used only "C01G002267" as the seed. The commands and output are listed below:

$ ./Cutoff 5 01.TF.genes.exp.txt
No. of TFs: 1956
No. of time points: 5

Cutoff value for your reference: 0.94 ~ 0.98

$ ./TO-GCN 5 01.TF.genes.exp.txt seed.txt 0.9
No. of TFs: 1956
No. of time points: 5
No. of initial TF seeds: 1
Cutoff: 0.90

Assigning levels for nodes in GCN by Breadth-First-Search (BFS) method......
Done!

Node_relation_and_level.zip
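To illustrate why the choice of seed matters, here is a generic breadth-first level assignment over an edge list. It is a plain textbook BFS written for this thread, not the TO-GCN source code, and the tool's own level numbering may differ (in the first run above, the seed's direct partner was also reported at level 1). The function name `bfs_levels` is an assumption.

```python
from collections import deque


def bfs_levels(edges, seeds):
    """Assign a level to every node reachable from the seeds.

    edges : iterable of (node_a, node_b) pairs, treated as undirected
    seeds : iterable of starting nodes, placed at level 1 (an assumption)

    Nodes that cannot be reached from any seed get no level at all.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    levels = {seed: 1 for seed in seeds}
    queue = deque(levels)
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in levels:
                levels[neighbour] = levels[node] + 1
                queue.append(neighbour)
    return levels
```

Only the connected component that contains the seed is ever visited, so a seed with a single partner yields a tiny network no matter how many TF genes are in the matrix.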

petitming

cyycyj commented 2 months ago

Dear Yao-ming,

Thank you for your debugging assistance. I discovered that my issue might be caused by the complex gene IDs. When I use the simplified version of the gene IDs, as I provided to you, it works well.

By the way, could you please provide me, and anyone else trying to use this epic tool, with a more detailed guide on how to build a network in Cytoscape based on Node_level.tsv and Node_relation.csv? I found several misleading methods for building Cytoscape networks on the internet.
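In the meantime, one possible route (a sketch, not an official protocol) is to turn the two output files into a clean edge table and a clean node table that Cytoscape's table importers can read. The helper names and output file names below are assumptions, and the level parser assumes gene IDs contain no whitespace.

```python
import csv


def clean_edges(lines):
    """Parse Node_relation.csv lines into (source, target, pcc) rows."""
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader, None)  # skip the "node_1 ID, node_2 ID, PCC_value" header
    return [
        (row[0].strip(), row[1].strip(), float(row[2]))
        for row in reader
        if len(row) >= 3
    ]


def clean_levels(lines):
    """Parse Node_level.tsv lines into (gene, level) rows."""
    rows = []
    for index, line in enumerate(lines):
        parts = line.split()  # tolerant of tabs or runs of spaces
        if index == 0 or len(parts) < 2:
            continue  # skip the header and blank lines
        rows.append((parts[0], int(parts[1])))
    return rows


def write_table(path, header, rows):
    """Write rows as a comma-separated file with a one-line header."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def convert(relation_path, level_path, edges_out, nodes_out):
    """Create the two Cytoscape-ready tables from the TO-GCN outputs."""
    with open(relation_path) as handle:
        write_table(edges_out, ["source", "target", "PCC"], clean_edges(handle))
    with open(level_path) as handle:
        write_table(nodes_out, ["name", "level"], clean_levels(handle))
```

Calling `convert("Node_relation.csv", "Node_level.tsv", "edges.csv", "nodes.csv")` would write both tables. In Cytoscape 3, `edges.csv` could then be loaded with File > Import > Network from File (source and target columns as the interaction endpoints) and `nodes.csv` with File > Import > Table from File, keyed on the node name, after which `level` is available for style mappings or layout grouping.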
