Bioconductor / Contributions

Contribute Packages to Bioconductor
135 stars 33 forks source link

IsoformSwitchAnalyzeR #349

Closed kvittingseerup closed 7 years ago

kvittingseerup commented 7 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 7 years ago

Hi @kvittingseerup

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: IsoformSwitchAnalyzeR
Type: Package
Title: An R package to Identify, Annoatate and Visialize Isoform Switches with Functional Consequences (from RNA-seq data).
Version: 0.99.0
Author: Kristoffer Vitting-Seerup
Maintainer: Kristoffer Vitting-Seerup <k.vitting.seerup@gmail.com>
Description: Since 2010 state-of-the-art bioinformatics tools have allowed researchers to reconstruct and quantify full length transcripts from RNA-seq data. Such geneome wide isoform resolution data has the potential to facilitate both genome wide analysis of alternative isoform usage and identification of isoform switching. Unfortunatly these types of analysis are still only rarely done and/or reported - in fact only 11% of articles analyzing RNA-seq data publised start 2016 performed analy isoform analysis. We hypothesis that there are 3 reasons why RNA-seq data is not used to its full potential: 1) There is still a lack of tools that can identify isoform switches with isoform resolution - thereby identifying the exact isoforms involved in a switch. 2) Although there are many very good tools to perform sequence analysis there is no common framework, which allows for integration of the analysis provided by these tools. 3) There is a lack of tools facilitating easy and article ready visual visualization of isoform switches. To all 3 problems we developed IsoformSwitchAnalyzeR. IsoformSwitchAnalyzeR is an easy to use R package that enableswhich enables statistical identification as well as visualization of isoform switches with predicted functional consequences from RNA-seq data
License: GPL (>= 2)
Depends: R (>= 3.1.1), methods, plyr, reshape2, gridExtra, Biostrings, IRanges, GenomicRanges, BSgenome, RColorBrewer, rtracklayer, cummeRbund, ballgown, ggplot2, VennDiagram, spliceR
Imports: plyr, reshape2, gridExtra, Biostrings, IRanges, GenomicRanges, RColorBrewer, ggplot2
Suggests: knitr, BSgenome, BSgenome.Hsapiens.UCSC.hg19
VignetteBuilder: knitr
Collate: classes.R import_data.R test_isoform_switches.R analyze_ORF.R analyze_external_sequence_analysis.R intron_retention.R analyze_switch_consequences.R isoform_plots.R plot_all_iso_switch.R high_level_functions.R tools.R extractGenomeWideAnalysis.R
biocViews: GeneExpression, Transcription, AlternativeSplicing, DifferentialExpression, DifferentialSplicing, Sequencing, Visualization, StatisticalMethod, TranscriptomeVariant, BiomedicalInformatics, FunctionalGenomics, SystemsBiology, Transcriptomics, RNASeq, Annotation, FunctionalPrediction, GenePrediction, DataImport
bioc-issue-bot commented 7 years ago

Your package has been approved for building. Your package is now submitted to our queue.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170418120309.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

2578581 Updated vignette with relative (instead of absolut... 00ce6dc Updated version number

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR, WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170419041927.html

lawremi commented 7 years ago

What about having an object that contains all of the pipeline parameters, like ScanBamParam in Rsamtools? Then the user could construct that object and pass it to both part 1 and part 2, with less chance of inconsistencies. It might also help with passing the parameters down to the lower level routines.

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

7d55ede Update of documentation enabling pdf versions

kvittingseerup commented 7 years ago

@lawremi I agree that it might be useful if a user want to use non-default parameters although it is really only the "alpha" and "dIFcutoff" parameters occur in many functions and where such a parameter object would be useful. However I think a large fraction of the potential users are not necessarily used to R so "hiding" parameter choices outside the function documentation might generate more harm/confusing than good...?

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR, WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170420085200.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

bb32a7d Updated namespace

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170420090757.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

408d18c Fix namespace issue try #2

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170420120746.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

3baa8d9 version bump

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170420143704.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

68793a2 Added a class description and updated R version de...

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170421060205.html

vobencha commented 7 years ago

To address your comment posted on bioc-devel, these warnings are issued because packages you depend on have defined functions with the same name. This is not your problem to fix.

Before I do a full review, please look at the packages in Depends and determine if you really need to attach all of these to the search path. Can any be moved to Imports?

Also address these issues:

Undefined global functions or variables: CI_down CI_hi CI_low CI_up Condition Domain IF bg dbGetQuery dev.off dnorm geneFraction gene_expression grid.layout grid.newpage hcl idNr importCufflinksData isoFraction isoform_feature lines na.omit nrGenesWithConsequences nrIsoWithConsequences optim p.adjust pchisq pdf plot png pnorm pt pushViewport qnorm qt queryHits read.table seqlevels seqlevels<- setNames setTxtProgressBar sigEval sigLevel sigLevelPos significance switchConsequence text title txtProgressBar value variable viewport weighted.mean wilcox.test x xmax xmin y yend ymax ymin Consider adding importFrom("grDevices", "dev.off", "hcl", "pdf", "png") importFrom("graphics", "lines", "plot", "text", "title") importFrom("stats", "dnorm", "na.omit", "optim", "p.adjust", "pchisq", "pnorm", "pt", "qnorm", "qt", "setNames", "weighted.mean", "wilcox.test") importFrom("utils", "read.table", "setTxtProgressBar", "txtProgressBar") to your NAMESPACE file.

Valerie

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

8d86f05 Update to reduce the number of Depends packages

kvittingseerup commented 7 years ago

Hi Valerie ( @vobencha ) Thanks for taking time to review my package as well as the suggestions - I really appreciate it!

I have now updated the "Dependes" and the "Namespace" as well as I (think) I can. I still have to "Depend" on two packages: 1) SpliceR - which internally uses the mcols() function that cannot be used unless the GenomicRanges packages is attached. 2) cummeRbund - which internally use the "dbDriver" function ( cummeRbund::readCufflinks() line 70 ) which cannot be accessed unless cummeRbund is attached.

My guess is that these "errors" occure because the packages themselfs "Depends" on other packages.

But else I have moved the of the dependencies to Imports. I have furthermore updated the namespace with (an extended) version of your suggestions. The rest of the Notes about undefined variables arises when I call directly on varaibles in ggplot (e.g. ggplot( aes(x=myX)) adds 'myX' to the list of undefined variables).

Kristoffer

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170424112245.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

f3c7c40 Another namespace vs imports update

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170425051944.html

kvittingseerup commented 7 years ago

Hi Valerie ( @vobencha )

Now IsoformSwitchAnalyzeR is back to only give the warnings due to the loading of other packages. Looking forward to hear from you (although I naturally realise now is a busy time with the new BioC release).

kvittingseerup commented 7 years ago

Dear Valerie ( @vobencha )

I was simply wondering whether you had any idea when you might get around to review my package? :-)

Kindest regards /Kristoffer

vobencha commented 7 years ago

Hi, Sorry for the delay - too many things going on. I'll take a look at this today. Valerie

vobencha commented 7 years ago

Actually this was the issue that disappeared for some time. I came back to reivew it and the issue and repo were gone. Not sure what happened there - but yes, I'll review it today.

vobencha commented 7 years ago

Hi,

I've taken a look at IsoformAnalyzeR. Please see suggestions below.

(1) Remove require(methods) from classes.R. Move the methods package from 'Imports' to 'Depends' in DESCRIPTION.

(2) Truncate code files to max of 80 characters wide. It's very hard to read and follow with so much wrapping.

(3) Instead of writing a subsetSwitchAnalyzeRlist() function why not implement single '[' or double '[[' bracket subsetting on the object?

(4) classes.R has much more than just class definitions and is difficult to navigate. The standard approach is to either put just class definitions in one file and functions/helpers go in separate files or create a file for each class where the class definition is with the functions/helpers.

(5) Input classes

I strongly suggest you use SummarizedExperiment or RangedSummarizedExperiment as input and remove switchAnalyzeRlist.

It doesn't make sense to make a formal class for something that is already an existing class. For example,

setClass("switchAnalyzeRlist",
         representation("list")
)

switchAnalyzeRlist is exactly a list; it doesn't have any additional slots, no validity, nothing that distinguishes it from a regular list. If there are requirements that this list have elements named

isoformFeatures
exons
conditions
sourceId

then you need to write a validity method that checks this. Validity should be invoked when the object is created or modified. The validity should check name requirements as well as data type. Can 'exons' be integers or character or ? S4 classes should also have getters and setters etc.

(6) CDSSet class

I don't see why it's necessary to create a new class here. CDSSet has the same information as a GRanges class. There is also a coersion method to convert the GRanges to a character vector with as.character().

It looks like you're reading data from a GTF file. Is there a reason why you aren't you using rtracklayer::import() or GenomicRanges::makeGRangesFromGFF()?

(7) Please explain the value of SpliceRList class.

(8) Please remove all .pdf and .html files from vignettes/:

~/repos/git/software/IsoformSwitchAnalyzeR/vignettes >ls
IsoformSwitchAnalyzeR.html  IsoformSwitchAnalyzeR.Rmd     Overview_figure_simple.pdf
IsoformSwitchAnalyzeR.R     Overview_figure_detailed.pdf  statistics_illustration.pdf

Only the .Rmd file should be there.

This review will take several passes. Please make the changes suggested above and I'll take another look. Let me know if you have questions.

Thanks. Valerie

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

9ce467c Major update introducing switch identification via...

kvittingseerup commented 7 years ago

Dear Valerie (@vobencha)

Thanks for taking the time to review my package. I have followed all your suggestions with a few exceptions which I try to my reasoning below below.

Please note that I have performed a major update on the R package to integrate with the DRIMSeq package which have also changed dependencies and namespace a bit.

Comment (1) Remove require(methods) from classes.R. Move the methods package from 'Imports' to 'Depends' in DESCRIPTION. Response : Done.

Comment (2) Truncate code files to max of 80 characters wide. It's very hard to read and follow with so much wrapping. Response: I have split all code into lines max 80 character wide with the exception of places where it was not possible, text strings (such as warnings) and comments.

Your comment (3) and (5): (3) Instead of writing a subsetSwitchAnalyzeRlist() function why not implement single '[' or double '[[' bracket subsetting on the object? (5) I strongly suggest you use SummarizedExperiment or RangedSummarizedExperiment as input and remove switchAnalyzeRlist.

Response: The subsetSwitchAnalyzeRlist() enables removeal of specific isoforms and their assocaited data acoress all entries in the switchAnalyzeRlist (removing specific rows in each entry instead of the whole entry). Do you think this is a misuse of the subset() method?

Unfortunatly the SummarizedExperiment is not really well suited the entries in the switchAnalyzeRobject as they have very different lengths and very different layouts: The 'isoformFeatures' entry have 1 row per isoform pr comparison. The orfAnalysis entry have 1 row per isoform. The switchConsequence have 1 row per comparison of isoform per consequence analyzed. As the enteries are very differen (just like edegR, etc) they will not fit into a SummarizedExperiment matrix like framework. That is why a list type format was used.

The reason why I created the switchAnalyzeRlist object was actually mostly to implement the subset() method described above and to avoide interfering with the normal subset method of a standard list.

What do you think the appropriate way of doing this is? Should I discard the 'switchAnalyzeRlist' class, build a seperate validity method and introduce the subsetSwitchAnalyzeRlist() function as a function instead of as the subset() method?

Commment (4) classes.R has much more than just class definitions and is difficult to navigate Response: classes.R have now been split into classes.R and methods.R

Your comment (6) and Comment (7) : CDSSet class, Please explain the value of SpliceRList class. Response: I am sorry these were leftover classes I copied from the bioconductor "spliceR" package at some point due to errors I got during the automatic checks. They have now been removed.

Your comment (8) Please remove all .pdf and .html files from vignettes/: Response: I have removed the file generated when I test the vignette but the files "Overview_figure_simple.pdf", "Overview_figure_detailed.pdf" and "statistics_illustration.pdf" are nessesary as they are incooperated into the vignette when it is created (and needs to be in the same folder as the .Rmd file)

Your (unnumbered) comment : It looks like you're reading data from a GTF file. Is there a reason why you aren't you using rtracklayer::import() or GenomicRanges::makeGRangesFromGFF()? Response: I am using rtracklayer::import(). The importGTF() function is just a wrappper for the rtracklayer::import() function that adds additional functionality (interpreting CDS and prediction of pre-mature termination codons) and outputs the data as a switchAnalyzeRlist object.

/Kristoffer

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170519124028.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

d0a71dc Fixes Bioconductor review #1 and other small stuff... f53257c Ammendment to last update

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR, WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170524054328.html

kvittingseerup commented 7 years ago

Dear Valerie (@vobencha)

The only warning/error I get is due to problems with rtracklayer on tokay1. The error probably due to that rtracklayer itslef have an build error on tokay1 ( https://www.bioconductor.org/checkResults/3.6/bioc-LATEST/rtracklayer/tokay1-checksrc.html ).

Do we need to wait for that to be fixed before the review can continue?

Kindest Regards Kristoffer

vobencha commented 7 years ago

Thanks for making the changes:

Re: Your Response: The subsetSwitchAnalyzeRlist() enables removeal of specific isoforms and their assocaited data acoress all entries in the switchAnalyzeRlist (removing specific rows in each entry instead of the whole entry). Do you think this is a misuse of the subset() method?

A more standard approach would be to provide a functions that identifies isoforms to be removed and then remove them with ']' or ']]'.

Re: Your Response I have removed the file generated when I test the vignette but the files "Overview_figure_simple.pdf", "Overview_figure_detailed.pdf" and "statistics_illustration.pdf" are nessesary as they are incooperated into the vignette when it is created (and needs to be in the same folder as the .Rmd file)

Why not use jpeg or png files?

Valerie

kvittingseerup commented 7 years ago

Dear Valerie (@vobencha)

Thanks for the feedback. It is a very good idea to switch to jpg’s – I am just so used to working with pdfs I did not even think about that. I will replace the pdfs with PNG/JPGs in the next update.

With regards to subsetting I think the use of “[“ or “[[“ might be a bit misleading as it performs a different function for a list-type object – in hindsight I think it is the same with the subset() method.

How about I remove the subset() method and just instruct people to use the subsetSwitchAnalyzeRlist() function instead? The function could also be renamed to reduceSwitchAnalyzeRlist() or something similar)?

/Kristoffer

vobencha commented 7 years ago

Yes, I think removing subset() is fine; keep or rename subsetSwitchAnalyzeRlist(), whichever you prefer. Valerie

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

fbe8b18 Update for fitting into Bioconductor and more - se...

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/IsoformSwitchAnalyzeR_buildreport_20170602093008.html

kvittingseerup commented 7 years ago

Dear Valerie (@vobencha)

I have now made the suggested updates as well as some small other fixes.

/Kristoffer

vobencha commented 7 years ago

Great. Thanks for your work on the package. Marking as approved. Valerie

bioc-issue-bot commented 7 years ago

Your package has been accepted. It will be added to the Bioconductor svn repository and nightly builds. Additional information will be sent to the maintainer email address in the next several days.

Thank you for contributing to Bioconductor!

kvittingseerup commented 7 years ago

Dear Valerie (@vobencha)

Thanks for taking the time to review and improve my R package.

/Kristoffer