Bioconductor / AnnotationForge

Tools for building SQLite-based annotation data packages
https://bioconductor.org/packages/AnnotationForge

Populating genes table: Erreur : database is locked #31

Closed: TSolDour closed this issue 10 months ago

TSolDour commented 2 years ago

Hi,

I'm trying to build a custom org.db object for Crassostrea gigas, and this is what R returns when I run makeOrgPackage:

Populating genes table:
Error: database is locked
In addition: Warning messages:
1: In file.remove(dbFileName) :
  cannot remove file './org.Cgigas.eg.sqlite', reason 'Permission denied'
2: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
3: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
4: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().
5: In result_fetch(res@ptr, n = n) :
  SQL statements must be issued with dbExecute() or dbSendStatement() instead of dbGetQuery() or dbSendQuery().

I don't know why the database is locked. Do I have to do something to unlock it? I understand that './org.Cgigas.eg.sqlite' already exists, but it is outdated and unusable for me.

Here is the script I'm using:

makeOrgPackage(gene_info = Cg_Sym,
               go = Cg_GO,
               version = "0.1",
               maintainer = "Thomas Sol Dourdin <thomas.sol.dourdin@ifremer.fr>",
               author = "Thomas Sol Dourdin <thomas.sol.dourdin@ifremer.fr>",
               outputDir = ".",
               tax_id = "29159",
               genus = "Crassostrea",
               species = "gigas",
               goTable = "go")

I hope you have a clue about how to solve my issue!

Best regards, Thomas.
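
One possible reading of the warnings above is that the old './org.Cgigas.eg.sqlite' could not be deleted because something (another R session or an open database connection) still holds the file, so the new build collides with it. A minimal sketch, assuming that is the cause, of pointing outputDir at a fresh, empty directory instead of "."; the directory prefix is hypothetical, it uses base R only, and the makeOrgPackage call is left commented out because it needs the real Cg_Sym and Cg_GO tables:

```r
# File name taken from the warning message; the build directory is hypothetical.
db_name <- "org.Cgigas.eg.sqlite"

# Create a fresh, empty directory so no stale database can be in the way.
out_dir <- tempfile(pattern = "orgdb_build_")
dir.create(out_dir)

# Report (without deleting anything) if an older copy sits in the working directory.
old_copy <- file.path(".", db_name)
if (file.exists(old_copy)) {
  message("An older ", db_name, " exists in the working directory; ",
          "close any session or connection using it before deleting it.")
}

# The build would then target the fresh directory (assumption: same arguments
# as in the script above, only outputDir changed):
# library(AnnotationForge)
# makeOrgPackage(gene_info = Cg_Sym, go = Cg_GO, version = "0.1",
#                maintainer = "Thomas Sol Dourdin <thomas.sol.dourdin@ifremer.fr>",
#                author = "Thomas Sol Dourdin <thomas.sol.dourdin@ifremer.fr>",
#                outputDir = out_dir, tax_id = "29159",
#                genus = "Crassostrea", species = "gigas", goTable = "go")
```

Restarting R before rebuilding is another way to release any connection left open by a failed run.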

lshep commented 2 years ago

Yes, an OrgDb for this species already exists in AnnotationHub; it was generated on 2021-10-13:

> library(AnnotationHub)
> ah = AnnotationHub()
snapshotDate(): 2022-02-22
> query(ah, c("orgDb", "Crassostrea"))
AnnotationHub with 2 records
# snapshotDate(): 2022-02-22
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Crassostrea virginica, Crassostrea gigas
# $rdataclass: OrgDb
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["AH96120"]]' 

            title                              
  AH96120 | org.Crassostrea_virginica.eg.sqlite
  AH96137 | org.Crassostrea_gigas.eg.sqlite    
> ah["AH96137"]
AnnotationHub with 1 record
# snapshotDate(): 2022-02-22
# names(): AH96137
# $dataprovider: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
# $species: Crassostrea gigas
# $rdataclass: OrgDb
# $rdatadateadded: 2021-10-13
# $title: org.Crassostrea_gigas.eg.sqlite
# $description: NCBI gene ID based annotations about Crassostrea gigas
# $taxonomyid: 29159
# $genome: NCBI genomes
# $sourcetype: NCBI/UniProt
# $sourceurl: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/, ftp://ftp.uniprot.org/p...
# $sourcesize: NA
# $tags: c("NCBI", "Gene", "Annotation") 
# retrieve record with 'object[["AH96137"]]' 

Note that we will be regenerating this file in the next four weeks or so, in preparation for the next Bioconductor release at the end of April.

When we generate the file for the hub we use makeOrgPackageFromNCBI, and when I ran it this morning it worked. (It does take quite a while to run, a few hours, and it may need up to 60 GB of free disk space to download files.)

> makeOrgPackageFromNCBI(version = "0.1",
                         author = "Some One <so@someplace.org>",
                         maintainer = "Some One <so@someplace.org>",
                         outputDir = ".",
                         tax_id = "29159",
                         genus = "Crassostrea",
                         species = "gigas")

If you need an updated object built from your own data right away, can you show how the Cg_Sym and Cg_GO files were made, and post the sessionInfo() for your R/Bioconductor session?

TSolDour commented 2 years ago

Hi, thanks for your answer!

Actually, I succeeded in creating an object by giving a fake tax_id and genus/species.

However, this object is incomplete because it contains only "MF" GO terms, and when I try to recreate it with the complete GO data it no longer works! makeOrgPackage alternately returns two errors:

Maybe I could send you the two files so you can check whether they are formatted correctly? Or, if it's easier, could you show an example of such files?

Best, Thomas

lshep commented 2 years ago

Yes, could you please send the files to lori.shepherd@roswellpark.org, and I will try to investigate further. Sorry for the inconvenience.

jmacdon commented 10 months ago

Please don't post questions about how to use a package here. This is supposed to be for issues/bugs. You are simply not using the function correctly. In future, post questions on support.bioconductor.org.

For now, note that your oSym.tsv file has only one column. Presumably you wanted the second column to contain symbols?
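
To illustrate the point about columns: as I read the makeOrgPackage documentation, every data frame passed in must have GID as its first column, must not contain duplicated rows, and the table named by goTable must have the columns GID, GO and EVIDENCE. The sketch below is a hedged, base-R illustration of that shape. The helper check_org_table is hypothetical (it is not part of AnnotationForge), and the gene IDs, symbols and GO rows are made-up placeholders.

```r
# Made-up two-column symbol table and three-column GO table, tab-separated.
sym_tsv <- "GID\tSYMBOL\n105317113\tgeneA\n105317114\tgeneB\n"
go_tsv  <- paste0("GID\tGO\tEVIDENCE\n",
                  "105317113\tGO:0005524\tIEA\n",
                  "105317113\tGO:0005634\tIEA\n",
                  "105317114\tGO:0006412\tIEA\n")

# Read everything as character so IDs are not converted to numbers.
Cg_Sym <- read.delim(text = sym_tsv, colClasses = "character")
Cg_GO  <- read.delim(text = go_tsv, colClasses = "character")

# Hypothetical helper: returns a character vector of problems (empty if none).
check_org_table <- function(df, required) {
  problems <- character(0)
  if (ncol(df) < 2) {
    problems <- c(problems, "needs at least two columns")
  }
  if (!identical(colnames(df)[1], "GID")) {
    problems <- c(problems, "first column must be named GID")
  }
  missing_cols <- setdiff(required, colnames(df))
  if (length(missing_cols) > 0) {
    problems <- c(problems,
                  paste("missing columns:", paste(missing_cols, collapse = ", ")))
  }
  if (anyDuplicated(df) > 0) {
    problems <- c(problems, "contains duplicated rows")
  }
  problems
}

# A one-column table, like the oSym.tsv described above, fails the check.
one_col <- data.frame(GID = c("105317113", "105317114"),
                      stringsAsFactors = FALSE)

print(check_org_table(Cg_Sym, c("GID", "SYMBOL")))
print(check_org_table(Cg_GO, c("GID", "GO", "EVIDENCE")))
print(check_org_table(one_col, c("GID", "SYMBOL")))
```

Tables that pass this check would then be supplied as gene_info = Cg_Sym and go = Cg_GO, with goTable = "go".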