aertslab / SCENIC

SCENIC is an R package to infer Gene Regulatory Networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

feather v1 or v2 for R package #334

Open zji90 opened 1 year ago

zji90 commented 1 year ago

Hi,

I have installed the latest version of the SCENIC R package, and it seems to work only with the feather v1 databases, not with feather v2. With v2 I get the error: "The index column 'features' is not available in the file." Has the R version not been updated to work with feather v2, or am I doing something wrong? Thanks!

cgraham13 commented 1 year ago

Hey,

I have been having the same issue as you with the features error. What did you do to get around that?

Here is my code:

hg38_dbs <- list('500bp'='hg38refseq-r80500bp_up_and_100bp_down_tss.mc9nr.genes_vs_motifs.rankings.feather',
                 '10kb'='hg38refseq-r8010kb_up_and_down_tss.mc9nr.genes_vs_motifs.rankings.feather')
db_path <- '/Users/carlygraham/Dropbox/BramsonLab/MoA/Scenic/dbDatabases/'

cellInfo <- data.frame(seuratCluster=Idents(full.integrated))

org <- "hgnc"

scenicOptions <- initializeScenic(
  org = org,
  # human
  dbDir = db_path,
  dbs = hg38_dbs,
  datasetTitle = "ScenicAnalysis_1",
  # db_mcVersion = 'v9',
  nCores = 4
)

And this is the error I get:

Motif databases selected:
  hg38refseq-r80500bp_up_and_100bp_down_tss.mc9nr.genes_vs_motifs.rankings.feather
  hg38refseq-r8010kb_up_and_down_tss.mc9nr.genes_vs_motifs.rankings.feather
[1] "The index column 'features' is not available in the file."
[1] "The index column 'features' is not available in the file."
Using the column 'A1BG' as feature index for the ranking database.
Error in vec_as_location2_result():
! Can't extract columns that don't exist.
✖ Column features doesn't exist.
Run rlang::last_error() to see where the error occurred.
Warning messages:
1: In initializeScenic(org = org, dbDir = db_path, dbs = hg38_dbs, :
  It was not possible to load the following databses; check whether they are downloaded correctly:
  hg38refseq-r80500bp_up_and_100bp_down_tss.mc9nr.genes_vs_motifs.rankings.feather
  hg38refseq-r8010kb_up_and_down_tss.mc9nr.genes_vs_motifs.rankings.feather
2: In RcisTarget::importRankings(dbFile, columns = rnktype) :
  The following columns are missing from the database: features

Thanks!
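
Editor's sketch: the error above says the loader looks for an index column named 'features' and does not find it. Judging from the rest of this thread, the v1 files carry a 'features' column while the newer files carry 'motifs' instead. The helper below is hypothetical (it is not part of SCENIC or RcisTarget) and works on toy column names, only to show the check that decides which index column a file has.

```r
# Hypothetical helper (not part of SCENIC or RcisTarget): given the column
# names of a rankings file, report which known index column it carries.
detectIndexCol <- function(colNames, candidates = c("features", "motifs")) {
  found <- candidates[candidates %in% colNames]
  if (length(found) == 0) {
    stop("None of the expected index columns found: ",
         paste(candidates, collapse = ", "))
  }
  found[1]
}

# Toy column names standing in for real files (assumed layouts)
v1Cols <- c("features", "A1BG", "A2M")
v2Cols <- c("A1BG", "A2M", "motifs")

detectIndexCol(v1Cols)
detectIndexCol(v2Cols)
```

For a real file, the column names could be obtained with colnames(arrow::read_feather(path)), at the cost of loading the whole file into memory.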

zji90 commented 1 year ago

You can use the v1 databases:

https://resources.aertslab.org/cistarget/databases/old/

darkcircle commented 1 year ago

I am trying to find hg19-tss-centered-10kb-7species.mc9nr.feather in the old directory linked above, but it is not there. How can I get it?

gloriafight commented 1 year ago

I also want to download the "hg19-tss-centered-10kb-7species.mc9nr.feather" file. Did you solve the problem?

darkcircle commented 1 year ago

Weirdly, this file is missing from the old directory.

Please click here to download hg19-tss-centered-10kb-7species.mc9nr.feather: Enter code:6666

I got the message "Install the latest version of network disk client", and nothing happened after I clicked the button.

Well ... is this the best way to get a file? That link may only work in China.

gloriafight commented 1 year ago


You can now use the latest database resources. Modify the feather files as follows:

library(RcisTarget)

# Note: each write_feather call overwrites the downloaded file in place
db <- importRankings("./hg19-tss-centered-10kb-10species.mc9nr.genes_vs_motifs.rankings.feather", indexCol = "motifs")
names(db@rankings)[1] <- "features"
db@org <- "hgnc"
db@genome <- "hg19"
arrow::write_feather(db@rankings, "./hg19-tss-centered-10kb-10species.mc9nr.genes_vs_motifs.rankings.feather")

db <- importRankings("./hg19-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather", indexCol = "motifs")
names(db@rankings)[1] <- "features"
db@org <- "hgnc"
db@genome <- "hg19"
arrow::write_feather(db@rankings, "./hg19-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather")

Then you can run the next step to filter genes:

scenicOptions <- initializeScenic(org="hgnc", dbDir="./", dbs=db.hgnc, nCores=2)
exprMat <- as.matrix(dta_harmony@assays$RNA@counts)
genesKept <- geneFiltering(exprMat, scenicOptions,
                           minCountsPerGene = 3 * 0.01 * ncol(exprMat),
                           minSamples = ncol(exprMat) * 0.01)
exprMat_filtered <- exprMat[genesKept, ]
# Gene list saved in int/1.1_genesKept.Rds

However, when you use pyscenic, the original feather should be used.
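
Editor's sketch: since pySCENIC needs the original feather file, one option is to write the renamed copy to a separate directory instead of overwriting the download. The two helpers below are hypothetical, use base R only, and run on a toy data frame; the directory name "r_scenic_dbs" is an assumption chosen for illustration.

```r
# Hypothetical helpers, base R only (not part of SCENIC or RcisTarget).

# Rename the index column by name rather than by position.
renameIndexCol <- function(rankings, from = "motifs", to = "features") {
  if (!from %in% names(rankings)) {
    stop("Column '", from, "' not found")
  }
  names(rankings)[names(rankings) == from] <- to
  rankings
}

# Build an output path in a separate directory, keeping the file name.
convertedPath <- function(inFile, outDir) {
  file.path(outDir, basename(inFile))
}

# Toy rankings table standing in for db@rankings
toy <- data.frame(motifs = c("m1", "m2"), A1BG = c(2L, 1L), A2M = c(1L, 2L))
converted <- renameIndexCol(toy)
names(converted)

convertedPath("./hg19-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather",
              "r_scenic_dbs")
```

With a real database, the same idea would be to pass renameIndexCol(db@rankings) and convertedPath(inFile, outDir) to arrow::write_feather, and then point dbDir in initializeScenic at the new directory. This has not been checked against every SCENIC version.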

Polligator commented 1 year ago

I looked into this issue. The problem stems from the checkAnnots and dbLoadingAttempt functions (https://rdrr.io/github/aertslab/SCENIC/man/ScenicOptions-class.html). checkAnnots hardcodes rnktype = "features", so even if you set dbIndexCol = "motifs", the check still fails. The workaround, at least for this step, is to change dbLoadingAttempt to use indexCol = "motifs", and to change checkAnnots() so that rnktype is either a customizable parameter or set to "motifs" for the new databases. The code already marks rnktype = "features" with a TODO to make it customizable. You also need to set dbs = defaultDbNames[["mgi"]] and dbIndexCol = "motifs" in the initializeScenic() call.
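
Editor's sketch of the point above, using toy stand-ins that are NOT the SCENIC source: a check with a hardcoded column name fails on the new layout no matter what the caller configures, while the same check with the name exposed as a parameter passes.

```r
# Toy stand-ins for illustration only; these are not SCENIC functions.

# The column name is fixed inside the function, so callers cannot change it.
checkHardcoded <- function(cols) {
  rnktype <- "features"
  rnktype %in% cols
}

# The column name is a parameter with the old value as its default.
checkConfigurable <- function(cols, rnktype = "features") {
  rnktype %in% cols
}

# Assumed column layout of a newer rankings file
v2Cols <- c("A1BG", "A2M", "motifs")

checkHardcoded(v2Cols)                        # FALSE
checkConfigurable(v2Cols, rnktype = "motifs") # TRUE
```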

jmvera255 commented 1 year ago

@joyduck I too identified dbIndexCol and rnktype as issues preventing use of the mm10 rankings and modified accordingly to set to motifs. I also modified motifAnnotName to incorporate "v9" for mc9nr rankings files in runSCENIC_2_createRegulons.R.

I can now run initializeScenic without issue using new mm10 rankings. I have no idea if additional issues will be encountered as I work through the next steps.

These changes are available in my forked SCENIC repo use_mm10 branch: https://github.com/jmvera255/SCENIC/tree/use_mm10

Polligator commented 1 year ago

It's been a while since I tried this. As I recall, I could run through all the steps without any issues after making those changes, but the repository might have changed since then. Good luck.

jmvera255 commented 1 year ago

@ShaoZhiting24 I modified dbIndexCol in several places in ScenicOptions.R. You can view the changes in the following commits:
https://github.com/jmvera255/SCENIC/commit/999b779d6fcb90da8a4fa7af8415d1ab5c15e4c2
https://github.com/jmvera255/SCENIC/commit/58b61932f90c81c33fecce2ad25c72c5af3de785

Then, when I run initializeScenic, I set dbIndexCol = "motifs".

Polligator commented 1 year ago

For anyone who really wants to use the new database, it's probably a better investment of your time to run the latest pySCENIC, instead of trying to hack this in R. If the details given here do not enable you to solve the issue, you probably don't have a thorough understanding of how this operates, and it's unlikely that you require the updated database.

KOBE24DUNK commented 1 year ago

Thanks, and sorry for asking repeatedly. My questions are solved now. If anyone else runs into empty regulons (after correctly following the details here), please see: https://github.com/aertslab/pySCENIC/issues/177#issuecomment-698684416.

WietechaLab commented 9 months ago

Thank you! This solved my issue for loading SCENIC in R.