Closed whedon closed 6 years ago
Hello human, I'm @whedon. I'm here to help you with some common editorial tasks. @andrewyatz it looks like you're currently assigned as the reviewer for this paper :tada:.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
Attempting PDF compilation. Reticulating splines etc...
Hi @serine, thanks for your interesting submission. @andrewyatz will review. To expedite the process, can you go through above checklist and make sure we ought to be able to tick the boxes? And ping us here when you are done. Thanks!
Hi @serine I've been running your installation procedure and have two issues with it. Could you double check your installation scripts on a clean system and just make sure everything is in order please? I'd rather not be opening lots of issues on your repos if I can avoid it plus it lengthens the review process incredibly.
Ping @serine. The reviewer needs to be able to run the software. Can you provide the documentation to do just that? Should be straightforward with JVM. If you have already done that and can tick above boxes (e.g. for contribution guidelines) spell it out for the readers. Thanks!
@pjotrp and @andrewyatz as I've mention in my previous comment RNAsik
depends on a lot of different tools and I some what assumed that dependencies of every tool don't need to be listed under RNAsik
. In the case of BigDataScripit it depends on java, golang and ant, given that I said BigDataScript is required I assumed it dependencies.
I since updated the docs to reflect all major/crucial dependencies in the docs, that of course requires a few system installs. I've tested the docs out and it worked for me. If you follow quick start RNAsik
should install. As I also mentioned in the docs now, I've only really tested this on ubuntu 16.04 (also works on 14.04). I don't have a mac so can't really test on that, but you wouldn't want to run it on mac anyway given the resources, unless the data is small of course e.g bacteria.
so like I said docs have been updated with new install instruction.
Let me know how it goes. also if you gonna meet some issues on mac do let me know so that I can accommodate mac users too.
thanks
Just to follow up on the previous comment above. virtualenv
isn't strictly necessary if you already have ansible
installed, you can skip virtualenv
step.
Every time you are opening a new shell you'll need to export
both bds and RNAsik
again. You can put those two export lines into your ~/.bashrc
if you like. I personally don't like to pollute my ~/.bashrc
with random exports.
A general comment on bio-ansible
, which I think I already made somewhere, you can change bio.yml
to all.ym
and add --ask-become-pass
which will prompt you for your sudo
password. This will do bunch of apt-get
installs (again, ubuntu specific) following by tools installation. The problem with doing all.yml
this will install potentially many more apt-get
packages than you really don't need for RNAsik
I realise now that this is rather involved installation process. So far this is the best we have. If you have any suggestion and/or comments don't hesitate to open an issue under RNAsik
issues.
cheers
Complicated tools may have complicated setups ;). One solution would be to provide a Docker container (though I am no fan particularly). @andrewyatz are up up for trying? Possibly a VM would be the easiest.
yeh, I'm not sold on docker yet, but I can try getting you a docker image if that's preferred. Let me know @andrewyatz
I'm going to try normally. You have my upmost sympathy with respect to dependency management and bioinformatic programs.
You may want to look into GNU Guix some day. I don't have any problems with the dependency graph.
@andrewyatz how are you doing?
Hi @serine. Looks like we have an impasse. Software deployment is the achilles heel of bioinformatics!
Can you look into a deployment scheme? To me Docker, bioconda or GNU Guix are the systems of choice. Apt should not be used because it interferes with other software on a system. It should be easy to run software - at least a version of it. One advantage of choosing one of the three that it includes dependencies and it that gets promoted for future pipelines too. There is a paper coming out soon on comparing work flows, including conda and Guix. If you are interested I can send you a pre-press.
@pjotrp It is a shame that you guys having issues with running pipeline, but it is great to get this kind of feedback about "ease of use" so to speak. Like I said before I can't imagine any other pipeline will be any easier to install. At the end of the day the pipeline doesn't require any installation at all, just BigDataScript, that can be downloaded as pre-compiled binary. But of course the pipeline of no use if other tools aren't there.
I think docker beta ready. I'd have to dig into bio-ansible a bit (docker image is also build with bio-ansible, but you won't have to do it. I'll make it) I think internally we are would like to head bioconda way, in fact similar to docker, we are some what in alpha, but at the moment there is no conda channel for Golang and so there isn't a clean way of specifying dependency for BigDataScript at the moment.
I did have a look at Guix briefly, I wasn't convinced.. I get a little confused when package manager gets mixed in with distro.. similar to https://nixos.org/nix/ which is both a package manager and a distro..
Yes please I'd like the pre-print if this is ok. Need to be convinced that docker or bioconda is the way to go.. I personally like to compile from source hence bio-ansible, but do understand "ease of use" and quick access is more important
I have a few things piled up and am helping at running a python course next week. Also down here (Melbourne) we having Easter break from the 30th of March, so might be out of touch for a couple of week, but will try to get this all sorted and report back as soon as I can
Cheers
Hi. Sorry this is more my fault that it's just not been possible for me to devote enough time to test through the existing changes that @serine has implemented. Don't take this as a comment on the changes but my availability.
Okay I've resurrected my work and switched to an Ubuntu box. Still no joy. In fact the same error I see on my mac. This is in MonashBioinformaticsPlatform/RNAsik-pipe#22 with the error.
Sorry can I prod again on this. There's little to no chance of me installing this. As far as I am aware there's been no response to the issues I've raised. The error isn't limited to just my mac as otherwise my VM attempts would have worked. Not sure where else to push this now
@andrewyatz sorry about that, I've gone quite, my fault. I just responded to your other issue could you have one more try. Thanks
@andrewyatz It does appear that bio-ansible way to install RNAsik
is a bit too much. Taking advice from @pjotrp and others around me I've created conda package, for linux only at this stage, which should simplify install great deal.
Could you have a look at updated docs
Basically you'll need to install miniconda and then RNAsik from my "channel" (conda terminology here)
bash Miniconda3-latest-Linux-x86_64.sh
conda install -c serine rnasik
conda install -c bioconda qualimap
QualiMap
is separate install, because it itself has fair few dependencies and it was tricky to include into RNAsik
package
On my laptop it took about 35 minutes to install, this is subject to reasonably good internet connection as installation creates fair bit of traffic, downloading all of the different dependencies.
Hopefully this will simplify things for you and others
Cheers
by the way this is my channel page https://anaconda.org/serine and this is rnasik package page https://anaconda.org/serine/rnasik
@andrewyatz sorry, I think you also need to added a couple of channel
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
before installing from conda
also updated docs.
p.s default location of the config is ~/.condarc
mine looks like this
[biostation2]~$ cat ~/.condarc
channels:
- bioconda
- conda-forge
- defaults
@andrewyatz do you have time to look at this? We can look for another reviewer if necessary.
@pjotrp @andrewyatz Let me guys know if you need anything else from me.
Just to reiterate that the best way to install RNAsik
is to use conda install
. I actually shifted myself and now use conda install
all the time, cause I tend to jump between diff machines.
Also just to let you know that RNAsik
have moved since I submitted for review. I'm now up to 1.5.1
version and making 1.5.2
soonish.. there have been some changes, but all in backward compatible manner.
You can only get 1.5.1
through conda install, but that shouldn't effect the review right?
Cheers
Hi @pjotrp as the delay in my response suggests that I am no longer a good person to do this review. I am really sorry about this and apologise to Kirill and yourself about this. I appreciate the effort you have been spending on improving the process and feel conda is an excellent way to move in
On 29 May 2018, at 23:30, Kirill Tsyganov notifications@github.com wrote:
@pjotrp @andrewyatz Let me guys know if you need anything else from me.
Just to reiterate that best way to install is to use conda install. I actually shifted myself and now use conda install all the time, cause I tend to jump between diff machines.
Also just to let you know that RNAsik have moved since I submitted for review. I'm now up to 1.5.1 version and making 1.5.2 soonish.. there have been some changes, but all in backward compatible manner.
You can only get 1.5.1 through conda install, but that shouldn't effect the review right?
Cheers
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
@andrewyatz no problem, thanks for all the feedback that you've provided so far.
@pjotrp hopefully someone else can help out?
Cheers
Alright, let's look for someone who can run this pipeline, so we can wrap up review.
Hi @serine, sorry about the delay. So, you are saying that a conda install should work. Do you happen to have some example fastq files, i.e, something I can try? I am proposing I run it myself with your help. If it works I can switch to a reviewer role and I'll ask someone else to be the editor. It has happened before.
@pjotrp yes, conda installable. I just wrote quick_install.bash script that obviously does quick install, but I'm also using it with my travisCI testing.
As you can see from quick start section in the docs that you can just run the script and export executables into your PATH. The details of the script are explained under installation section or just cat
the script obviously
The actual test data again explained in data set for testing section and I've updated the README with the command to run, but here it is just in case
RNAsik -fqDir http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/fqFiles.txt \
-align star \
-fastaRef ftp://ftp.ensembl.org/pub/release-91/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz \
-gtfFile ftp://ftp.ensembl.org/pub/release-91/gtf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.91.gtf.gz \
-counts \
-paired \
-all
There are a couple of things I should mention.
RNAsik
is under active development and version 1.5.1 has moved on fair bit with bug fixes and new feature. I once again hard coded RNAsik specific commit into the install script. I'm waiting on bds to make next v2.0rc9
release before making my next release 1.5.2
I understand that there are a lot ifs here. But things are out of my hands with respect to bds development.
Basically if you just run the install script and test command I posted above it will work. Just note that STAR aligner needs fair bit of RAM, the test data set is small and I think 10-15 Gb of RAM should be plenty for mouse or human you really need ~ 32 Gb of RAM
thanks
Forgot to mention new feature in RNAsik
is -fqDir http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/fqFiles.txt
where -fqDir
can either take a file or directory. If text file is given, fastq files are expected to be one per line and if directory is give it will get traversed recursively looking for fastq files
the content of fqFiles.txt
in this case is
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/ctrl1_R1.fastq.gz
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/ctrl1_R2.fastq.gz
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/ctrl2_R1.fastq.gz
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/ctrl2_R2.fastq.gz
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z1_R1.fastq.gz
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z1_R2.fastq.gz
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z2_R1.fastq.gz
http://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z2_R2.fastq.gz
RNAsik
will handle the download for you. Also note that fqFiles.txt
is on url as well, again download and parsing are all handled internally, just run the command as is above.
I'll give it a go this week.
Sorry about the delay. This is on my list now.
great, thanks. 1.5.2 is out now by the way
On Mon, 16 Jul 2018, 19:14 Pjotr Prins notifications@github.com wrote:
Sorry about the delay. This is on my list now.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openjournals/joss-reviews/issues/583#issuecomment-405189048, or mute the thread https://github.com/notifications/unsubscribe-auth/AGY8Q9U4D0onMHu1BieeuH8hwyzI_PKTks5uHFlqgaJpZM4SF7TN .
I installed the software with bioconda bootstrapping from the GNU Guix conda package. Nice work! Mostly worked out of the box. I'll show a writeup when I am done.
@serine - do you have a test dataset I can just download and run? Sorry for the pain, but I don't want to be hunting down NCBI resources. Even a small set will do. I would like a single command to download the data - or does RNASik doe this?
RNAsik -fqDir https://bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/fqFiles.txt \
-align star \
-fastaRef ftp://ftp.ensembl.org/pub/release-91/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz \
-gtfFile ftp://ftp.ensembl.org/pub/release-91/gtf/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.91.gtf.gz \
-counts \
-paired \
-all
Fatal error: /home/pjotr/.conda/envs/joss-review-583/bin/../opt/rnasik-1.5.2/src/sikFqFiles.bds, line 94, pos 9. no FASTQ files were found, check -fqDir, -extn or -pairIds
Stack trace:
error "no FASTQ files were found, check -fqDir, -e ... # /home/pjotr/.conda/envs/joss-review-583/bin/../opt/rnasik-1.5.2/src/sikFqFiles.bds:94
samplesMap = getSamplesMap( fqFiles,samplesSheet ... # /home/pjotr/.conda/envs/joss-review-583/bin/../opt/rnasik-1.5.2/src/RNAsik.bds:174
sorry about that, will have a look at the issue. I thought I tested out and it worked
[bioinformatics1]~/www/sikTestData/rawData$ tar -tf IndustrialAntifoamAgentsYeastRNAseqData.tar
IndustrialAntifoamAgentsYeastRNAseqData/
IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z1_R2.fastq.gz
IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z2_R1.fastq.gz
IndustrialAntifoamAgentsYeastRNAseqData/ctrl2_R1.fastq.gz
IndustrialAntifoamAgentsYeastRNAseqData/ctrl1_R2.fastq.gz
IndustrialAntifoamAgentsYeastRNAseqData/ctrl1_R1.fastq.gz
IndustrialAntifoamAgentsYeastRNAseqData/ctrl2_R2.fastq.gz
IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z2_R2.fastq.gz
IndustrialAntifoamAgentsYeastRNAseqData/Ind-Z1_R1.fastq.gz
[biostation2]~$ wget bioinformatics.erc.monash.edu/home/kirill/sikTestData/rawData/IndustrialAntifoamAgentsYeastRNAseqData.tar
Let me know if that works, thanks
My report: I can report the software installs and runs to satisfaction. In fact, kudos @serine and colleagues for creating such an advanced pipeline which can be run on almost any Linux system out of the box with nice reporting output. I hope people will start using it. Since it is bionconda based you should also look at making it part of the common workflow initiative https://www.commonwl.org/ with @mr-c.
Let me tick the relevant boxes which means we can process the publication for acceptance. @serine, sorry about the delays, it is not the usual, but then your software is not the usual either :). Good stuff.
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
The review process is now complete. To finalize your submission and accept your paper in JOSS, we need two things. First, can you confirm that all references in your bibliography have a DOI (if one exists).
Second, we need you to deposit a copy of your software repository (including any revisions made during the JOSS review process) with a data-archiving service. To do so:
This is great, thanks heaps !
About the second part; I've done Zenodo here https://zenodo.org/record/1315013#.W0_PY3VuaV4
I was a bit confused about which DOI to use.. I just tool whatever was there in the pds that you've generated.
About the first bit can you confirm that all references in your bibliography have a
Looking through the references these tools don't have doi
Broadinstitute. n.d. “Picard-Tools.” broadinstitute. http://broadinstitute.github.io/
picard.
Li, Heng. 2013. “Aligning Sequence Reads, Clone Sequences and Assembly Contigs with
BWA-MEM,” March. http://arxiv.org/abs/1303.3997.
Powell, David. 2015. “Degust: Powerfull and User Friendly Front-End Data Analsysis,
Visualisation and Exploratory Tool for Rna-Sequencing.” github. http://degust.erc.
monash.edu
do I need to do something about this?
Would an RRID URI be acceptable? I like to cite both the idea of a piece of software and any relevant methods papers.
Picard:
https://identifiers.org/rrid/RRID:SCR_006525
Degust:
https://identifiers.org/rrid/RRID:SCR_001878
For bwa-mem I suggest listing both the Arxiv paper and the RRID https://arxiv.org/abs/1303.3997 https://identifiers.org/rrid/RRID:SCR_010910
(There are citation instructions for BWA but it doesn't mention the mem
algorithm: http://bio-bwa.sourceforge.net/ just short
and sw
)
@arfon we are ready for acceptance. Can you shine your wisdom on above? Note to others: Arfon is still on holiday, so it may take a little while.
And thanks @andrewyatz for your review! You did most of the hard work :+1:
great thanks a lot guys, and yes shout out to everyone involved, every little bit of advise, comment and gitissue really helped in shaping this into a better tool !
@mr-c was that comment Would an RRID URI be acceptable? to me or at @arfon cause I'm happy with what ever works
@serine my comment about RRIDs was mostly to JOSS management. I'm just a random person with no special standing here :-)
But, while you're open to feedback from strangers, I would suggest splitting up the Summary section into multiple paragraphs. I find it hard to read as one big block of text.
https://github.com/openjournals/joss-papers/blob/joss.00583/joss.00583/10.21105.joss.00583.pdf
The template is also overlapping the page numbers with the footer.
@mr-c how about now https://zenodo.org/record/1315013#.W1A5FHVuaV4 ?
using
pandoc -o check.pdf --bibliography=paper.bib paper.md
builds to this
not sure how it'll look with JOSS compilation.. also don't think I can do much about footer overlap
if you think this looks ok I can commit this change to master and perhaps ask @pjotrp to rebuild final pdf if that's ok?
@serine much more readable, thank you.
Submitting author: @serine (Kirill Tsyganov) Repository: https://github.com/MonashBioinformaticsPlatform/RNAsik-pipe Version: 1.4.9 Editor: @pjotrp Reviewer: @andrewyatz Archive: 10.5281/zenodo.1403976
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@andrewyatz, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @pjotrp know.
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
paper.md
file include a list of authors with their affiliations?