Closed: lshep closed this 6 months ago
@lshep I'll take a look.
@jmacdon Thanks. Worth noting that I could be wrong about it being related to that commit; I'm just thinking that writing each time might be the bottleneck. I think the intent was that if it was using the same data, it wouldn't need to write anything new and would just use the existing data, only writing when being updated. But again, it could be somewhere else too.
@lshep What is the exact call you are using to build the OrgDb?
I'm using the recipe in AnnotationHub that calls this function underneath:
meta <- updateResources("NonStandardOrgDb", BiocVersion = "3.17", preparerClasses = "NCBIImportPreparer", metadataOnly = FALSE, insert = FALSE, justRunUnitTest = FALSE)
Backed out changes to .downloadData, so using rebuildCache = FALSE will now use the existing NCBI.sqlite Db to build the OrgDb instead of rebuilding the SQLite Db first.
When I say significantly slower, I mean this: it used to rebuild the cache once a day, which took a few hours, but we had optimized it so that when building multiple in a row, subsequent calls would take a few seconds (minutes at most). Now it takes hours for each one again. I am still trying to build the 3.17 non-standard OrgDbs to put into AnnotationHub, which requires building ~1900 right now. This used to take me 3 days; it's now going on 6 weeks or more! My suspicion is that it has to do with this commit https://github.com/Bioconductor/AnnotationForge/commit/27b4772bb164ed40269b9a770e4bfdd722fdbbc8 but if I move it back then in local testing I see the previously reported
which also never used to occur.
Any advice is appreciated @jmacdon