Received a valid push; starting a build. Commits are:
ab9172a Bumped version
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Hi,
Currently my package passes all tests, but I am getting one warning. The functions that use the R package rentrez to extract Entrez IDs related to a given phenotype from NCBI, and then convert the gene IDs to gene symbols using biomaRt, do not meet the 5-minute processing time criterion. Is there an alternative that would reduce this processing time? Thanks a lot in advance for all the help and guidance.
Thanks, Surajit
Bioconductor 'org' packages allow mapping between ids and symbols, e.g.,
library(org.Hs.eg.db)
ids = c("ENSG00000121410", "ENSG00000175899", "ENSG00000256069", "ENSG00000171428",
        "ENSG00000156006", "ENSG00000196136")
mapIds(org.Hs.eg.db, ids, "SYMBOL", "ENSEMBL")
I'm not exactly sure what you mean by 'entrez ids related to a given phenotype'
Hi Martin,
Thanks a lot for your reply. The function has two parts: i) extract Entrez gene IDs related to a user-defined term (disease/phenotype) from NCBI databases such as ClinVar and OMIM, using rentrez; ii) convert the gene IDs to gene symbols using BioMart.
Together these steps increase the processing time of the package, because both fetch data from external databases. The most time-consuming part of the whole process is getting the ensembl object using useMart from biomaRt. As I am printing both the Ensembl ID and the Entrez gene ID as output, do you think I can use another tool to do the same? I tried using org.Hs.eg.db, fetching gene symbols and Ensembl IDs separately, but the time difference is marginal.
Thanks again for all your guidance and help.
Thanks, Surajit
I'm not really sure what you mean by the system time being marginal; mapIds()
will generally return in under a second
> system.time(mapIds(org.Hs.eg.db, ids, "SYMBOL", "ENSEMBL"))
'select()' returned 1:1 mapping between keys and columns
user system elapsed
0.110 0.001 0.111
whereas biomaRt is a web call and must be much slower than that?
I think the conversation would be easier if you were to provide some simple reproducible examples that I can copy and paste into my own R session.
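As a rough illustration of the local lookup discussed above, the sketch below fetches both gene symbols and Ensembl IDs for Entrez IDs in a single call to AnnotationDbi::select(), with no web request. The helper name map_entrez and the example IDs are hypothetical, not taken from nanotatoR, and the code assumes org.Hs.eg.db is installed.

```r
# Sketch: look up gene symbols and Ensembl IDs for Entrez IDs in one local call.
# map_entrez() is a hypothetical helper; it assumes org.Hs.eg.db is installed.
map_entrez <- function(entrez_ids) {
  if (!requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
    stop("org.Hs.eg.db is required; install it with BiocManager::install()")
  }
  AnnotationDbi::select(
    org.Hs.eg.db::org.Hs.eg.db,
    keys = as.character(entrez_ids),
    columns = c("SYMBOL", "ENSEMBL"),
    keytype = "ENTREZID"
  )
}

# Illustrative usage (uncomment when org.Hs.eg.db is available):
# res <- map_entrez(c("1", "2"))
# system.time(map_entrez(c("1", "2")))
```

select() returns a data frame and may return more than one row per key when an Entrez ID maps to several Ensembl IDs, so downstream code should not assume a 1:1 result.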
Received a valid push; starting a build. Commits are:
f5d2e46 Bumped version
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Hello,
I could get rid of the error by using the mapIds function. Thanks a lot for your advice on this. Somehow it was taking some time to run on my system, but when I restarted my computer, it ran properly. Thanks again for your advice.
Thanks, Surajit
Hello, my replies are inline, in bold.
Please let me know if you have any other queries.
Thanks again for all the help and advice.
Thanks, Surajit
Thank you for your submission to Bioconductor. Please see the following initial review.
General
The biggest concern is that your package should be compatible with other existing Bioconductor methods and objects. You currently don't import or use any existing Bioconductor infrastructure. Example: why did you not use the recommended rtracklayer::import for reading bed files? Your functions should be able to take standard Bioconductor classes as input (like GenomicRanges, the output of rtracklayer::import and the recommended way of reading BED files). We would also consider GSEABase for GeneSet and GeneSetCollections for gene results.
Response: The bed file input functionality we currently have reads bed files as text and converts the X and Y chromosomes to 23 and 24 respectively. This is done so that the result is compatible with the variant file created by the Bionano optical mapping platform. The user can either enter a Bionano-compatible bed file or input a normal bed file that can be converted into a Bionano-compatible bed file. From the next version of nanotatoR, we would use rtracklayer::import to import bed files. Other than this functionality, we have used Bioconductor functions wherever we got the chance, like using AnnotationDbi and org.Hs.eg.db for converting Entrez IDs to gene symbols.
Build Report
correct the note about coding practices regarding the use of 1:...
Response: Done
runnable examples - you should at least show the code that could be run in nanotatoR_main.Rd. run_bionano_filter example doesn't call itself in the example code?
Response: We have shown the code, but kept it un-executable as it provides the output as an Excel file, which has Rtools dependencies.
NEWS - see below
NEWS
The reason the build report is not picking up your NEWS file is the bad file extension. Your NEWS file is also formatted incorrectly. The news file should either be NEWS or NEWS.md and should be formatted as directed in the ?news helper. Remove the title section and format accordingly.
Response: Formatted in the format required.
DESCRIPTION
remove lazyData: true
Response: Done
Vignette
Change installation instruction to have Bioconductor installation.
Response: Done
In your code chunks be consistent with argument settings. You defined smap and then system file with directly accessing for hgpath. Then you define some arguments to use in your function but then don't pass them as arguments to your function.
Response: Done
Why are you providing your own bed file? Use existing ones in other packages or from the Bioconductor annotation or experiment hubs.
Response: We would implement this in the next iterations.
Correct the following:
2: In if (returnMethod_bedcomp == "Text") { :
the condition has length > 1 and only the first element will be used.
Fix this conditional to account for length > 1.
The results of running the code - datcomp is NULL
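One common way to avoid the "condition has length > 1" warning is to reduce a vector-valued default to a single value with match.arg() before testing it. The function below is a hypothetical sketch, not the package's code: the argument name mirrors returnMethod_bedcomp, and the allowed values "Text" and "dataFrame" are assumed for illustration.

```r
# Hypothetical function; the allowed values are assumed for illustration.
compare_bed <- function(returnMethod_bedcomp = c("Text", "dataFrame")) {
  # match.arg() picks the first element when the default vector is used,
  # and raises an error for values outside the allowed set
  returnMethod_bedcomp <- match.arg(returnMethod_bedcomp)
  if (returnMethod_bedcomp == "Text") {
    "write result to a text file"
  } else {
    "return result as a data frame"
  }
}

compare_bed()              # default: first allowed value, "Text"
compare_bed("dataFrame")   # explicit choice
```

From R 4.2.0 onwards a length > 1 condition in if() is an error rather than a warning, so normalising the argument up front also guards against a hard failure.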
Correct the following. You should not be using this argument if it is designated as deprecated.
Warning in biomaRt::useMart("ensembl", host = "www.ensembl.org",
ensemblRedirect = FALSE, : The argument "ensemblRedirect" has been
deprecated and will be removed in the next biomaRt release.
Response: Using org.Hs.eg.db instead.
run_bionano_filter: the code given cannot be run and results in ERROR - fix this
run_bionano_filter(SVFile=smappath,fileName,input_fmt_geneList="dataFrame",
input_fmt_svMap="Text",RtoolsZIPpath="")
Error in run_bionano_filter(SVFile = smappath, fileName, input_fmt_geneList = "dataFrame", :
unused argument (RtoolsZIPpath = "")
error running nanotatoR_main code chunk as well
nanotatoR_main(smap, bed, inputfmtBed = c("BNBED"),
n=3,mergedFiles , buildSVInternalDB=TRUE, soloPath, solopattern,
input_fmt_INF=c("dataframe"),returnMethod_GeneList=c("dataframe"),
returnMethod_bedcomp=c("dataframe"),returnMethod_DGV=c("dataframe"),
returnMethod_Internal=c("dataframe"),input_fmt_DGV=c("dataframe"),
hgpath, smapName,method=c("Single"), term, thresh=5,
input_fmt_geneList=c("dataframe"),input_fmt_svMap=c("dataframe"),
svData,dat_geneList,outpath="",outputFilename="",RZIPpath="")
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input
In addition: Warning message:
In file(file, "rt") :
file("") only supports open = "w+" and open = "w+b": using the former
Response: Took care of these errors.
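The read.table() error above arises when an empty string is passed as the file path. A minimal sketch of a guard that fails early with a clearer message is shown below; read_input_table is a hypothetical helper, and the tab-separated layout with a header row is an assumption for the example.

```r
# Hypothetical helper: validate the path before handing it to read.table()
read_input_table <- function(path) {
  if (!is.character(path) || length(path) != 1L || !nzchar(path)) {
    stop("'path' must be a single, non-empty file path")
  }
  if (!file.exists(path)) {
    stop("file not found: ", path)
  }
  read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
}

# Usage with a small temporary file (made-up data)
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(chr = c("1", "23"), start = c(100, 200)),
            tmp, sep = "\t", quote = FALSE, row.names = FALSE)
dat <- read_input_table(tmp)
```

Calling read_input_table("") stops with the custom message instead of reaching file("").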
Inst
Cannot open file in doc/ - remove or update.
Response: Removed it.
Document how you created or the source information for the files provided in inst/extdata in inst/scripts.
Response: The data sets used are sample example data made from real Bionano optical mapping datasets, which were downloaded from the Genome in a Bottle Consortium. Can you please advise us on the best way to cite those?
Are all these files standard output from a particular type of experiment or instrumentation?
Response: nanotatoR is used to annotate structural variant maps (smaps), which are produced by the optical mapping platforms from Bionano. These optical mapping tools are effective for better understanding and examining larger structural variants. Although the analysis software produced by Bionano is effective in calling variants, it does not provide any annotation. Our tool bridges this gap.
Why not use already existing bed files provided in already accepted Bioconductor packages instead of providing new ones? This is always encouraged.
Response: We would add this functionality in the next iteration.
R code
Have not done an in-depth look into the R code until the section General is considered. There should be a re-write making the functions compatible and consistent with existing Bioconductor infrastructure. See also: http://bioconductor.org/developers/how-to/commonMethodsAndClasses/
Please comment back here with changes and updates. Let me know when you would like me to re-review your package. Cheers
Can you please edit your response so that it is easy to read? bold, for instance, doesn't appear in several places. thanks
Edited and made all the comments bold.
Thanks, Surajit
Please address this note for all simple uses of the form 1:n.
* NOTE: Avoid 1:...; use seq_len() or seq_along()
Found in files:
Bed_SV_Comp.r (line 370, column 41)
Bed_SV_Comp.r (line 371, column 45)
Bed_SV_Comp.r (line 372, column 43)
Bed_SV_Comp.r (line 375, column 23)
Bed_SV_Comp.r (line 391, column 37)
Bed_SV_Comp.r (line 392, column 41)
Bed_SV_Comp.r (line 393, column 39)
Bed_SV_Comp.r (line 396, column 23)
Bed_SV_Comp.r (line 416, column 37)
Bed_SV_Comp.r (line 417, column 41)
Bed_SV_Comp.r (line 418, column 39)
Bed_SV_Comp.r (line 421, column 23)
Bed_SV_Comp.r (line 443, column 33)
Bed_SV_Comp.r (line 444, column 37)
Bed_SV_Comp.r (line 445, column 35)
Bed_SV_Comp.r (line 448, column 19)
entrez_extract.r (line 53, column 37)
entrez_extract.r (line 103, column 15)
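The reason behind this NOTE can be seen with an empty input: 1:n counts down when n is 0, so a loop body runs on indices that do not exist, whereas seq_len(n) yields an empty sequence. A small sketch with made-up counters:

```r
n <- 0

# 1:0 evaluates to c(1, 0), so this loop body runs twice
runs_colon <- 0
for (i in 1:n) runs_colon <- runs_colon + 1

# seq_len(0) is an empty integer vector, so this loop body never runs
runs_seq_len <- 0
for (i in seq_len(n)) runs_seq_len <- runs_seq_len + 1

runs_colon     # 2
runs_seq_len   # 0
```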
Please use rtracklayer to import your bed files, rather than writing your own parser.
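A minimal sketch of how this could look, assuming rtracklayer is installed: the BED file is read with rtracklayer::import(), and the X/Y to 23/24 renaming mentioned earlier in the thread is applied afterwards to the sequence names. The helper to_bionano_chr and the two-line BED file are hypothetical, written only for this example.

```r
# Hypothetical helper: strip a "chr" prefix and rename X/Y to 23/24
to_bionano_chr <- function(chr) {
  chr <- sub("^chr", "", as.character(chr))
  chr[chr == "X"] <- "23"
  chr[chr == "Y"] <- "24"
  chr
}

to_bionano_chr(c("chr1", "chrX", "Y"))

# The import step only runs when rtracklayer is available
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  suppressPackageStartupMessages(library(rtracklayer))
  bed_file <- tempfile(fileext = ".bed")
  writeLines(c("chrX\t100\t200\tgeneA", "chr2\t300\t400\tgeneB"), bed_file)
  gr <- import(bed_file, format = "BED")   # returns a GRanges object
  bionano_chr <- to_bionano_chr(as.character(seqnames(gr)))
}
```

Keeping the renaming as a separate step means the package can accept any GRanges object, not only ones read from its own files.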
For this:
"Document how you created or the source information for the files provided in inst/extdata in inst/scripts. The data sets used are sample example data made from real bionano optical mapping datasets, which are downloaded from Genome In a bottle Consortium. Can you please advise us the best way to cite those?"
Write a 'man' page describing the steps required to generate the data; maybe these are 'standard' outputs from some software, or perhaps the output generated when specific options are chosen, or perhaps you have had to process the data generated by the software. All of this should be documented on the appropriate man page.
Please in your response make it clear for me to see what changes you made. For instance, use bullet points or white space to clearly separate each response.
Received a valid push; starting a build. Commits are:
2309527 Bumped Version
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
9f1ddbf Version Bumped
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
5bcfeec Version Bump
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
c3a6c0c Update DESCRIPTION
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
8fe1036 Version Bump
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
Received a valid push; starting a build. Commits are:
e33bb8b Deleted errors and version bump
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
Hi, Thanks for your response. Please find below my reply regarding the review.
All instances of 1:n have been replaced by seq(n) or seq_len(length(n)).
I have used rtracklayer import to read in data.
I have put an R script in the inst/script/ folder detailing the different files, where they were downloaded from, how they were processed, and what each column means.
Please let me know if you have any further issues/comments.
Thanks again for all your help.
Thanks, Surajit
OK thanks. seq_len(length(x)) is usually just seq_along(x).
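A short check of this equivalence, using an empty vector as the edge case. It also shows that seq(n) with a single number behaves like 1:n, so it does not avoid the zero-length problem that seq_len() and seq_along() solve. The variable names are made up for the example.

```r
x <- character(0)

# Both give an empty integer vector for an empty input
same_index <- identical(seq_len(length(x)), seq_along(x))

# seq() with a single number is interpreted as 1:n, so seq(0) is c(1, 0)
len_seq <- length(seq(0))
len_seq_len <- length(seq_len(0))

same_index    # TRUE
len_seq       # 2
len_seq_len   # 0
```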
Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be posted to this issue in the next several days.
Thank you for contributing to Bioconductor!
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/VilainLab.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nightly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, your package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("nanotatoR"). The package 'landing page' will be created at
https://bioconductor.org/packages/nanotatoR
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
I am familiar with the essential aspects of Bioconductor software management, including:
For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.