Closed biocyberman closed 7 years ago
Hi!
Awesome PR. Unfortunately I don't have time to review it just now. But I've done a quick read through, and I think that it looks good. Until I have time to look at this more thoroughly maybe you could get the travis build working? As you will see (here)[https://travis-ci.org/NationalGenomicsInfrastructure/piper/jobs/50278087] it's complaining about not finding some dependencies (scroll to the end of the build log).
Thanks for the PR! :smile:
Hi Johan travis failed because it JHDF5 library to deal with XSQ colorspace data was not uploaded. I will fix that. However, I am thinking about logistics problem. Would it be better to submit PRs to master branch so we can 'communicate' through this common ground? Stable releases can be done via release tags that you have been doing or on a 'stable' branch. If you agree I will close this PR and open a new one to the master branch. PLEASE GIVE YOUR COMMENT ON THIS.
And some more words about the QPipe branch, even though I try to keep the 'upstream' intact, I had to make some changes to simplify the implementation of QPipe branch: To support XSQ and maybe BAM inputs later. I've made some rather fundamental changes in addtion to the one described in the original PR:
InputSeqFileContainer
. Follow this I refactored all uses of ReadPairContainer
in the repo. Most of the cases it is just a matter of renaming because InputSeqFileContainer
is backward-compatible. InputSeqFileContainer
supports more flexible and more generic ways to deal with different input formats.setup.sh
will NOT work if it enables compiling of GATK. An exception of MethoNotFound
will occur because of the outdated version of SLF4J embedded in Queue/GATK. The best way is that I will ask to upgrade SLF4J in Queue/GATK at its origin. commons-lang
, commons-io
package. This is because somehow these two packages in Piper's SBT build intefer with the same package names in Queue/GATK, causing again MethodNotFound
exception when Queue tries to use FileUtils.FileCopy
method. I hope will these changes to be merged into the master branch of Piper. Since it was sudden jump for me to Piper, I had to do several experiments and therefore forgot to commit all changes on time. This results in not so well isolated commits. I will anyways, do a 'clean-up' commit and start with well isolated ones. Back in the head, I think the code base is not too complicated, so it is not to bad to do this clean-up commit. This may challenge your patience :-)
Hi Johan, I have just pushed some new commits to QPipe branch. Notably in commit 894e8ac, I converted the project to Maven-based, which allows straight forward integration with GATK/Queue. This is important because it will be easy to refer to Queue source or including dependency directly from github Maven-based repos (i.e. gatk-projected repo).
Hi!
I'm just dropping a note here to let you know that I've not forgotten this. However I've been bogged down with a lot of other stuff to do lately and I've haven't had the time to look into this. I'll promise to get to it as soon as time permits.
/Johan
No worries Johan, I am in a not-so-different situation.
I've now had a quick read through of the code. and you can see my comments on it inline.
Some things of more general node:
Hi Johan, Thanks for taking time to look at the PR. Regarding the last two points:
config
, I thought that not every end-user should commit back their center specific files. In additions, we can easily parameterize it to pickup a folder outside git repo at build time. That is for the wrapper bash scripts. For xml files, it may be a good idea to create a bash script or an app to read user input and write the files.piper
needs and piper itself in or jar file. It maybe to few details to persuade you for Maven-based. Anyways, I think all usage scenarios of Maven for Piper
have been addressed in the POM file. So, reluctant or not, you have next to none of headache left to worry about Maven. Of course, there are many great other uses with Maven in other project if you feel comfortable with it.I will come back for some more commits later to get you updated with the latest.
Hi Johan I addressed most of the points you commented in the previous commits. As you can see only some non-essential points are left. When all are done, just tell me if you want me to create a PR to the master branch. Otherwise you can just merge to the master branch by yourself.
Hi @biocyberman! I've been feeling guilt about not addressing this for a very long time. I'm very sorry for leaving this hanging here.
The scope of Piper has drifted a bit over time, and now it's mostly being used to power our human whole genome sequencing pipelines - which means that at this point I'm skeptical about it being worth the effort for anyone else to go into the project. I'm going to close this pull request now - but if you feel like you are still interested in extending Piper or if you would like me to put you into contact with some other folks I know who seem to me to be working on things which are more easily adaptable to different centers, send me an e-mail at johan.dahlberg@medsci.uu.se and we can have a chat about that.
Once again I'm so sorry for not coming back to you on this, much, much, sooner.
I would like to suggest merging of the following two commits:
Commit aac74bce7e22
https://github.com/biocyberman/piper/commit/aac74bce7e22137e6fb7a539f21102c7ef5ebf3e
I created a 'config' directory at project base and moved three original files to 'config/upsala':
globalConfig.sh globalConfig.xml uppmax_global_config.xml
. I then modifiedsetup.sh
file to allow this syntax:./setup.sh /path/to/install/piper institution /path/to/otherlibraries/lib NoGATKCompile
An example command is like this:
./setup.sh /space/system/piper afmd /space/system/lib nogatk
Which means:
/space/system/piper
config/afmd
directory. Current possible choices areafmd
andupsala
, but more subdirectories for any new institutions can be added.setup.sh
not to download and compile GATK. This is nicer than commenting/uncommenting the function call.Commit 704a1ac9d26cc686d This is an additional commit to remove dependency on
GenomeAnalysisTK.jar
. This is becauseQueue.jar
already includes content ofGenomeAnalysisTK.jar
. Putting both of them together inlib
causes warning about redundant packages found, such as the one aboutorg.slf4j
bindings.https://github.com/biocyberman/piper/commit/704a1ac9d26cc686d460fd93dc9b551f6ced77f5
My commit tree looks a bit messy, I will try to keep them in good order from now on :)