aertslab / SCENIC

SCENIC is an R package to infer Gene Regulatory Networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

Feather file is old and R crashed #42

Closed PegasusAM closed 5 years ago

PegasusAM commented 6 years ago

Hi,

I'm trying to repeat the steps as directed. Everything is fine until:

library(RcisTarget)
motifRankings <- importRankings(getDatabases(scenicOptions)[[1]])

The error shown was "This Feather file is old and will not be readable beyond the 0.3.0 release", and then R crashed.

The feather files I used are: [image]

My previous code is:

library(SCENIC)
org="hgnc" # or mgi, or dmel
dbDir="databases" # RcisTarget databases location
myDatasetTitle="human_data" # choose a name for your analysis
scenicOptions <- initializeScenic(org=org, dbDir=dbDir, datasetTitle=myDatasetTitle, nCores=4)

It seems that the feather files were generated by an old version of the feather package and are no longer readable by the newest version (0.3.1). I've checked the feather GitHub repository, and they say there is no backwards compatibility. Any suggestions?

Thanks!

s-aibar commented 6 years ago

Hello,

That error usually happens when the databases are incomplete/corrupt (e.g. by a failed download).

The recommended way to download the databases is with zsync_curl (https://resources.aertslab.org/cistarget/help.html). Once you have the files, make sure their sha256sum values match the reported ones (https://resources.aertslab.org/cistarget/databases/sha256sum.txt). Also, check that you are using the latest R feather package (version 0.3.1; previous versions are more likely to crash due to this error).
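For anyone who prefers to do the checksum comparison from within R, here is a minimal sketch. It assumes the third-party digest package is installed and that each line of the checksum list has the usual "<hash>  <file name>" layout; both helper names (parse_checksums, checksum_matches) are hypothetical and not part of SCENIC or RcisTarget.

```r
# Sketch only: compare a local file's SHA-256 with an entry from a checksum list.
# Assumptions: the 'digest' package is installed, and each checksum line
# looks like "<64 hex characters>  <file name or path>".
library(digest)

# Hypothetical helper: turn checksum lines into a character vector of
# hashes, named by file name (any directory part is dropped).
parse_checksums <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  pattern <- "^([0-9a-fA-F]{64})[[:space:]]+\\*?(.+)$"
  parts <- regmatches(lines, regexec(pattern, lines))
  ok <- lengths(parts) == 3
  hashes <- vapply(parts[ok], function(p) tolower(p[2]), character(1))
  names(hashes) <- vapply(parts[ok], function(p) basename(p[3]), character(1))
  hashes
}

# Hypothetical helper: TRUE if the file's SHA-256 equals the listed value.
checksum_matches <- function(path, checksums) {
  expected <- checksums[basename(path)]
  if (is.na(expected)) stop("No checksum listed for ", basename(path))
  actual <- digest(path, algo = "sha256", file = TRUE)
  identical(actual, unname(expected))
}
```

With a local copy of sha256sum.txt, one would call parse_checksums(readLines("sha256sum.txt")) once and then checksum_matches() on each .feather file. A FALSE result suggests the download is incomplete or corrupt.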

Please, let me know if this helps!

PegasusAM commented 6 years ago

@s-aibar Thanks for your quick reply.

The weird thing is that RcisTarget cannot recognize the databases, except the ones I downloaded directly using the R code. For example, I have five feather files; the last two were downloaded using the R code and were the only ones RcisTarget recognized, but they do not match the sha256sum (they seem to be broken, which could be the cause of the Feather read failure, as you said). [image] [image]

My R code for downloading:

dbFiles <- c("https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-7species.mc9nr.feather",
"https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather")
for(featherURL in dbFiles){
  download.file(featherURL, destfile=basename(featherURL)) # saved in current dir
  descrURL <- gsub(".feather$", ".descr", featherURL)
  if(file.exists(descrURL)) download.file(descrURL, destfile=basename(descrURL))
}

s-aibar commented 6 years ago

What do you mean by "RcisTarget cannot recognize the database"? (Are you sure it is in the same directory, with the same name, etc...? )
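One quick way to answer the directory and file name question is to list what is actually in the database folder and compare it with the names you expect. The sketch below uses only base R; missing_db_files is a hypothetical helper name, and the expected file names have to be supplied by you.

```r
# Sketch only: report which expected database files are absent from db_dir.
# 'expected_names' are the exact file names you believe should be there.
missing_db_files <- function(db_dir, expected_names) {
  present <- list.files(db_dir, pattern = "\\.feather$")
  setdiff(expected_names, present)
}
```

For example, missing_db_files("databases", c("hg19-500bp-upstream-7species.mc9nr.feather", "hg19-tss-centered-10kb-7species.mc9nr.feather")) returns an empty character vector when both files are present under exactly those names.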

w2niva commented 4 years ago

How was this issue resolved?

Alexxyz123 commented 3 years ago

This is a big question, we need help

davidroad commented 3 years ago

Downloading the corresponding .feather file directly from their website (https://resources.aertslab.org/cistarget/) works.

hemantgujar commented 1 year ago

Are these the right files?

https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.genes_vs_motifs.rankings.feather; https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-tss-centered-10kb-7species.mc9nr.genes_vs_motifs.rankings.feather

My download from R kept getting interrupted, so I downloaded those files directly.

for(featherURL in dbFiles) {
  download.file(featherURL, destfile=basename(featherURL)) # saved in current dir
}

trying URL 'https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.genes_vs_motifs.rankings.feather'
Content type 'unknown' length 1022531034 bytes (975.2 MB)
downloaded 491.0 MB

Error in download.file(featherURL, destfile = basename(featherURL)) :
  download from 'https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.genes_vs_motifs.rankings.feather' failed
In addition: Warning messages:
1: In download.file(featherURL, destfile = basename(featherURL)) :
  downloaded length 514883584 != reported length 1022531034
2: In download.file(featherURL, destfile = basename(featherURL)) :
  URL 'https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.genes_vs_motifs.rankings.feather': Timeout of 60 seconds was reached
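The "Timeout of 60 seconds" warning comes from R's timeout option, which download.file() respects and which defaults to 60 seconds, too short for a file of roughly 1 GB on many connections. A minimal sketch of a more forgiving download follows. download_db is a hypothetical helper, the one-hour default is an arbitrary assumption, and mode = "wb" is included because .feather files are binary.

```r
# Sketch only: download one database file with a longer timeout.
# 'download_db' and its defaults are illustrative, not part of SCENIC.
download_db <- function(url, dest_dir = ".", timeout_sec = 3600) {
  # Raise the timeout temporarily and restore the old value on exit.
  old <- options(timeout = max(timeout_sec, getOption("timeout")))
  on.exit(options(old), add = TRUE)

  dest <- file.path(dest_dir, basename(url))
  # mode = "wb" writes the file in binary mode (matters on Windows).
  status <- download.file(url, destfile = dest, mode = "wb")
  if (status != 0) stop("Download failed for ", url)
  dest
}
```

It can be used in place of the plain download.file() call inside the loop, for example download_db(featherURL, dest_dir = "mm9_databases"). A completed download can still be corrupt, so comparing the file against the published sha256sum afterwards remains worthwhile.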

However, I am getting an error in the next step.

scenicOptions <- initializeScenic(org = "mgi", dbDir="mm9_databases")

Motif databases selected:
  mm9-500bp-upstream-7species.mc9nr.feather
  mm9-tss-centered-10kb-7species.mc9nr.feather
[1] "The index column 'features' is not available in the file."
[1] "The index column 'features' is not available in the file."
Error in eval(as.name(motifAnnotName)) : object 'motifAnnotations_mgi' not found
In addition: Warning message:
In initializeScenic(org = "mgi", dbDir = "mm9_databases") :
  It was not possible to load the following databses; check whether they are downloaded correctly:
mm9-500bp-upstream-7species.mc9nr.feather
mm9-tss-centered-10kb-7species.mc9nr.feather

Can anyone please help? Thanks.

Jane-96-45 commented 7 months ago

@hemantgujar

I have the same problem as you. I think there might be a difference between the old and new databases. I found the old version of the motif databases, which they used in their tutorial, and they work well with the workflow.

dbFiles <- c("https://resources.aertslab.org/cistarget/databases/old/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.feather",
"https://resources.aertslab.org/cistarget/databases/old/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-tss-centered-10kb-7species.mc9nr.feather")
for(featherURL in dbFiles) {
  download.file(featherURL, destfile=basename(featherURL)) # saved in current dir
}

I still haven't figured out how to use their new motif databases: mm9-500bp-upstream-7species.mc9nr.genes_vs_motifs.rankings.feather; mm9-tss-centered-10kb-7species.mc9nr.genes_vs_motifs.rankings.feather

Does anyone know?