aertslab / create_cisTarget_databases

Create cisTarget databases

Creating a cisTarget database for Zebrafish #39

Open RcuIT opened 11 months ago

RcuIT commented 11 months ago

Hi, I managed to run create_cistarget_motif_databases.py for zebrafish, and it generated three .feather files: zf.genes_vs_motifs.rankings.feather, zf.genes_vs_motifs.scores.feather and zf.motifs_vs_genes.scores.feather. How can I run create_cross_species_motifs_rankings_db.py? The output of create_cistarget_motif_databases.py did not include cisTarget motifs vs regions or genes rankings databases. How can I continue from here? Is it a must to have cross-species databases to run a GRN analysis? I would appreciate any response in this regard. Thank you.

ghuls commented 11 months ago

For pySCENIC you will only need zf.genes_vs_motifs.rankings.feather. Cross-species databases are no longer necessary, as they were in the past, because we now have far more motifs than we had then.

RcuIT commented 11 months ago

Thanks for the clarification. I have another question: what should I pass to the "org" parameter in initializeScenic()? It says 'org' needs to be one of: mgi, hgnc, dmel. How can I specify the org for zebrafish?

ghuls commented 11 months ago

SCENIC is deprecated; use pySCENIC instead: https://pyscenic.readthedocs.io/en/latest/installation.html#docker-podman-and-singularity-apptainer-images

pySCENIC does not need org, as you just give a motif-to-TF annotation file with zebrafish genes as an argument in the ctx step.

RcuIT commented 11 months ago

Thanks for the reply. I was trying to find a way to generate a motif-to-TF annotation file with zebrafish genes. Then I found this database created by @stanaka6. I ran the following command, but got an empty .csv file:

singularity run -B /home/roshanpe/Seurat/data:/data aertslab-pyscenic-0.12.1.sif pyscenic ctx /home/roshanpe/Seurat/data/24_expr_mat.adjacencies.tsv /home/roshanpe/Seurat/data/zf.genes_vs_motifs.rankings.feather --annotations_fname /home/roshanpe/Seurat/data/Zebrafish_Motif2TF_db_ready_for_SCENIC.tbl --expression_mtx_fname /home/roshanpe/Seurat/data/24.exp.tsv --mode "custom_multiprocessing" --output /home/roshanpe/Seurat/data/regulons.csv --num_workers 6

While the command is running, it shows the following warning for several genes:

2023-07-24 12:48:01,796 - pyscenic.transform - WARNING - Less than 80% of the genes in Regulon for si:dkey-253d23.9 could be mapped to zf.genes_vs_motifs.rankings. Skipping this module.

Any suggestions to overcome this issue? Thanks.

ghuls commented 11 months ago

This means that your gene names in the database don't match the gene names in your expression matrix.
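One way to confirm such a mismatch is to measure how many module genes appear among the database's gene columns. The following is only a minimal sketch, not pySCENIC's own check: the toy gene names are hypothetical, `load_database_genes` is a hypothetical helper that needs pyarrow, and it assumes the rankings database stores genes as columns plus one index column named "motifs".

```python
def mapped_fraction(module_genes, database_genes):
    """Return the fraction of module genes that are present in the database."""
    module = set(module_genes)
    if not module:
        return 0.0
    return len(module & set(database_genes)) / len(module)


def load_database_genes(feather_path):
    """Load a rankings database and return its gene column names (needs pyarrow).

    Assumption: every column is a gene except one index column named "motifs".
    This loads the whole table, so it can be slow for large databases.
    """
    import pyarrow.feather as feather

    columns = feather.read_table(feather_path).column_names
    return [name for name in columns if name != "motifs"]


# Toy data (hypothetical gene names, not taken from the real files).
database_genes = ["sox2", "pax6a", "neurod1", "gfap", "olig2"]
module_genes = ["sox2", "pax6a", "si:dkey-253d23.9", "ENSDARG00000000001", "Neurod1"]

fraction = mapped_fraction(module_genes, database_genes)
unmapped = sorted(set(module_genes) - set(database_genes))
print(f"{fraction:.0%} mapped; unmapped: {unmapped}")
```

Listing the unmapped names usually shows the cause quickly, for example Ensembl IDs versus gene symbols, or a difference in capitalisation. With real data, `database_genes` would come from `load_database_genes("zf.genes_vs_motifs.rankings.feather")` and the module genes from the adjacencies or expression matrix.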

RcuIT commented 11 months ago

Thanks for the reply. Could you please provide me with the steps to make a custom motif2TF file for zebrafish? I checked the resources.aertslab.org_cistarget_motif2tf_motifs-v9-nr.mgi-m0.001-o0.0.tbl.txt file. I saw on the issue page that you suggested substituting rat orthologs for the mouse genes, and I tried to do the same for zebrafish. I downloaded the zebrafish orthologs of mouse genes from Ensembl. The mouse txt file above has around 163339 entries listed under the gene_name column, and the same gene is repeated many times. Also, some zebrafish genes have no mouse ortholog and vice versa. How can I now substitute zebrafish orthologs for the mouse genes in the gene_name column? Is there any special procedure for this? The Excel file here, https://upenn.app.box.com/file/1267957948488?s=ky3gu1ur4bl02n4c4jez3n33o8r8olvy , contains the zebrafish genes, the mouse genes, and the genes taken from the gene_name column of the above tbl file. I would appreciate it if you could provide me with some information on how to do this substitution.

Best, Roshan

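The ortholog substitution asked about above could be sketched as follows. This is one possible approach under stated assumptions, not a procedure confirmed by the maintainers: each row of the mouse table is copied once per zebrafish ortholog of its gene_name, rows whose mouse gene has no ortholog are dropped, and exact duplicate rows are removed. Repeated genes in gene_name are expected, since each row pairs one motif with one gene. The function names and toy data are hypothetical, and only the gene_name column is taken from the thread; the real .tbl file has more columns, which this code would carry through unchanged.

```python
import csv
import io
from collections import defaultdict


def build_ortholog_map(pairs):
    """Map each mouse gene to the sorted list of its zebrafish orthologs."""
    mapping = defaultdict(set)
    for mouse_gene, zebrafish_gene in pairs:
        if mouse_gene and zebrafish_gene:  # skip pairs with a missing ortholog
            mapping[mouse_gene].add(zebrafish_gene)
    return {mouse: sorted(fish) for mouse, fish in mapping.items()}


def convert_rows(rows, ortholog_map, gene_column="gene_name"):
    """Yield one copy of each row per zebrafish ortholog of its mouse gene."""
    seen = set()
    for row in rows:
        # Rows whose mouse gene has no ortholog produce no output.
        for fish_gene in ortholog_map.get(row[gene_column], []):
            new_row = {**row, gene_column: fish_gene}
            key = tuple(new_row.items())
            if key not in seen:  # drop exact duplicate rows
                seen.add(key)
                yield new_row


# Toy stand-ins for the Ensembl ortholog export and the mouse .tbl file.
ortholog_pairs = [("Sox2", "sox2"), ("Pax6", "pax6a"), ("Pax6", "pax6b"), ("Zfp42", "")]
mouse_tbl = "#motif_id\tgene_name\nm1\tSox2\nm2\tPax6\nm2\tPax6\nm3\tZfp42\n"

reader = csv.DictReader(io.StringIO(mouse_tbl), delimiter="\t")
converted = list(convert_rows(reader, build_ortholog_map(ortholog_pairs)))
for row in converted:
    print(row["#motif_id"], row["gene_name"])
```

Expanding one-to-many orthologs keeps every candidate TF, at the cost of some annotations that may not hold in zebrafish. Whichever mapping is used, the resulting gene names still have to match those in the rankings database and the expression matrix.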