Closed mmulvahill closed 7 years ago
@smahaffey What are your thoughts on this? I've implemented it, but can revert back to the variable approach if you prefer it.
Reopen for discussion w/ @smahaffey
Good idea. I think that makes sense. Although I will point out for versioning the database we are changing the URL for 2.1 while keeping the URL for v1.x and v2.0 so that if people want to continue with work they started on that version they won't get different results now with the next update.
We could fix this by adding a parameter to the CGI script that takes the database name that we can version. It could default to v1.x/2.0 so that it wouldn't break old versions.
So that brings up a tricky issue... Bioconductor requires (I believe strictly so) new packages be versioned 0.99.0, and then increments to 1.0.0 upon acceptance.
So we wil likely need to maintain separate DB and R Package version #s.
How does versioning work on the DB and URL?
Here's what I know about the versions since I was using it but not really involved with the development until now.
For the Package: Ver 1.0 and 1.0.1 were released initially with human/mouse. Kevin also created v2.0 to add Rat data which we needed for our site. The database wasn’t updated other than adding Rat. He wanted to do more testing and then never released 2.0 publicly. Instead of updating 1.0.1 without Rat support I updated 2.0 to fix the problem with submitting larger lists. For that I left the URL the same because the database(multimir) reflects v2.0. Maybe the next release should be 3.0 and version 1.0 for bioconductor instead of using 2.1.
For the URL: To make it simple I was going to just add a version number on the script file name in the URL to change the database since the database is hard coded in the script.
http://multimir.ucdenver.edu/cgi-bin/multimir2.1.pl vs http://multimir.ucdenver.edu/cgi-bin/multimir.pl
However I think we should just add a parameter so we could keep the URL the same between versions and specify the database version in the request. I think we could make it default to the old version for backwards compatibility with v2.0.1 and earlier.
For the database: The database versions will just be a different database name for example right now the database is multimir but the new version will be multimir_2_1 although we can change the number to match the version we decide to go with.
The versioning requirement for Bioconductor is unfortunate, but we can note the change in version numbers on the website for people that download it directly.
Given that what do you think about:
This sounds good. Making the R package as agnostic to DB version as possible should reduce ongoing maintenance requirements.
multimir_dbVersion(url = getOption("multimir_url"))
multimir_dbInfo(url = getOption("mulitmir_url"),
version = getOption("multimir_dbversion"))
Does this match what you were thinking?
I want to keep usage as intuitive as possible and avoid explicitly requiring the user to consider layers of metainformation or other similar hierarchies, so let me know if you think of any other situations like this.
Yes that's exactly what I was thinking. I can work on implementing the server side, while I wait for some of the scripts to ru. I think to make it as easy as possible to maintain the server side we probably should have 2 URLs the current one that will support the versions prior to bioconductor and then a new one to support future versions. There are two options so we can avoid maintaining the perl script in the future.
Then add a database (multimir_versions) with a table(versions) which lists the name, version, creation date and if it is public yet. I think this should effectively decouple the database, perl script, and R package.
I think at this point we can just eliminate the 2.1 branch it only changed the URL which will change anyway.
My thought was to have the _dbVersion() function execute when loading the package in R -- the load/welcome message would then display the current version, show the default values of other package options, and give provide brief instruction on how to change the version.
For reproducibility sake, we want the user to be able to specify the version at the beginning of their code and have that code execute successfully no matter if the current version has changed -- would this be affected by having two separate URLs?
I'm thinking we should meet in person and talk through these issues. That sound like a good idea? If so, when is a good time for you?
Yeah let's meet and talk about it it. I think we just need to add a URL to support the new version and leave the current one alone for older versions that people have downloaded and installed from source. I will email you about times.
Also since I'm testing the database now I realized we need to have updated .rda files with the cutoff values for each DB in it. So however we set the version needs to get the file name from the DB and set it as an option that's used in the method to get the cuttoff.rda file. Just a note to remind me to account for that.
Let me know when you can meet to discuss it. I'm just about finished with testing the DB v2.1 so I can then modify the server side stuff and help with the R side so it will work with the modified server.
To summarise our discussions today:
I have completed the changes needed to allow using previous or future database versions from one version of the package:
I have added multimir_dbInfoVersions() which gets the version list along with pertinent info.
I have added a method to switch the database version being used. It will display an error message if the switch is unsuccessful. Likely due to wrong version string or server availability.
I've changed the initial setup to request a list of versions and setup to use the most current version. Welcome message displays info about version/updated date or error message if server is unavailable at the moment.
Options for file names for the db schema and the rda file will be loaded from the DB since they are tied to the DB version.
@mmulvahill - I don't have permission to push the changes to GitHub. Also I can't actually get it to build as a package now so I can't fully test it as a package, but testing the methods seems to work. I do not understand the issue I get an error about doc/index.html not existing if I build or try to tar/gzip and install it directly. Any ideas on fixing this error so I can finish testing my changes.
I added a parameter for the list of tables. MySQL seems really slow listing tables and this gets called repeatedly. This should speed everything up.
@smahaffey Thanks for implementing the db version change feature -- It looks great.
Quick FYI -- I'm currently about halfway through refactoring the package to reduce duplicate code, maintainence overhead, and make the pkg structure more consistent with 'best practices'. As a part of this, I'm deprecating all function arguments that are set by our new global options (mostly url
). The idea being that the user should be able to change structural options for accessing the webserver in a consistent way, but it should be hidden for typical usage.
Currently uses a variable local to the package environment. Using getOption to access and options("" = ...) to set may be advantageous in that it: