BioJulia / Bio.jl

[DEPRECATED] Bioinformatics and Computational Biology Infrastructure for Julia
http://biojulia.dev
MIT License

Release v0.2! #165

Closed by bicycle1885 8 years ago

bicycle1885 commented 8 years ago

Since the last development release of v0.1, we have merged lots of new features and bug fixes into the master branch. Why don't we release v0.2?

I think we need to do the following things at least:

Merging #163 depends on https://github.com/BioJulia/BufferedStreams.jl/pull/11 and https://github.com/BioJulia/Libz.jl/pull/21. Suppressing deprecation warnings can be done with Compat.jl (I can do that). Precompilation support would be straightforward once we support precompilation of Libz.jl.

In addition, I'd like to include #152 for sequence search, which is almost finished and would be indispensable for bioinformatics. The other advanced search features (#153 and #143) are not yet complete, so I want to postpone them to v0.3.

Any thoughts or suggestions?

TransGirlCodes commented 8 years ago

This sounds reasonable to me. It also means v0.3 can be the release where phylogenetics and population genetics are merged (which fits my timescale and other projects involving these two functionalities better), while v0.2 at least gives people the latest Seq.

On Tue, May 3, 2016 at 7:44 PM, Kenta Sato (佐藤 建太) <notifications@github.com> wrote:

Since the last development release of v0.1, we have merged lots of new features and bug fixes into the master branch. Why don't we release v0.2?

I think we need to do the following things at least:

Merging #163 (https://github.com/BioJulia/Bio.jl/pull/163) depends on BioJulia/BufferedStreams.jl#11 (https://github.com/BioJulia/BufferedStreams.jl/pull/11) and BioJulia/Libz.jl#21 (https://github.com/BioJulia/Libz.jl/pull/21). Suppressing deprecated warnings can be done with Compat.jl (I can do that). Precompilation support would be straightforward once we support the precompilation of Libz.jl.

In addition, I'd like to include #152 (https://github.com/BioJulia/Bio.jl/pull/152) for sequence search, which is almost finished and would be indispensable for bioinformatics. Other advanced search features (#153, https://github.com/BioJulia/Bio.jl/pull/153, and #143, https://github.com/BioJulia/Bio.jl/pull/143) are still not completed, so I want to postpone them to v0.3.

Any thoughts or suggestions?

bicycle1885 commented 8 years ago

The current master branch has many new features that are not included in v0.1:

timbitz commented 8 years ago

Does this mean that BioSequence will become the new default? I only ask because I currently have some code that relies on the FASTQ parser creating sequences in the old 2-bit encoding.
If the 4-bit encoding is to replace the 2-bit one, can you think of any way to specify a particular encoding to the parser? (I am even happy with N's being encoded as A's, since there is a quality score for FASTQ.)

bicycle1885 commented 8 years ago

Yes, the BioSequence type with 4-bit encoding is the new default type for DNA sequences. You can convert between the 2-bit and 4-bit encodings using the convert method. I don't know how your code depends on the underlying encoding, but if you don't directly touch the .data field of a sequence, there is no difference between these two encodings.
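
To make the difference concrete, here is a minimal Python sketch of packing bases at 2 and 4 bits each and converting between the two. It is an illustration only, not Bio.jl code: the bit patterns, the `pack`/`unpack`/`four_to_two` names, and the choice to raise an error on N are all assumptions made for the example.

```python
# Illustrative bit patterns only; these values are assumptions, not read from Bio.jl.
TWO_BIT = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
# One bit per base, so an ambiguous N can be stored as "any of A, C, G, T".
FOUR_BIT = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000, "N": 0b1111}


def pack(seq, table, bits):
    """Pack a sequence into one integer, `bits` bits per base, first base lowest."""
    word = 0
    for i, base in enumerate(seq):
        if base not in table:
            raise ValueError(f"{base!r} cannot be represented with {bits} bits per base")
        word |= table[base] << (bits * i)
    return word


def unpack(word, length, table, bits):
    """Recover the sequence string from a packed integer."""
    inverse = {code: base for base, code in table.items()}
    mask = (1 << bits) - 1
    return "".join(inverse[(word >> (bits * i)) & mask] for i in range(length))


def four_to_two(word, length):
    """Convert a 4-bit packed sequence to 2-bit; fails if it contains N."""
    return pack(unpack(word, length, FOUR_BIT, 4), TWO_BIT, 2)
```

Code that only goes through `pack`/`unpack` sees the same sequence either way; only code that reads the packed integer directly notices the encoding. The 2-bit table has no slot for N, which is why a conversion has to either fail or substitute a base.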

I think it is a useful feature to specify an encoding to the FASTA and FASTQ parsers. I will consider a clean API to support that, but implicitly replacing Ns with As looks awkward to me.

timbitz commented 8 years ago

Interesting. Yah, I have only minimally played with the new BioSequence, so I will dive a bit deeper and see what is going on. The API to specify encoding to FASTA/FASTQ would be great. This would remove the overhead of applying convert on each read, no?

Fair enough about the Ns to A's... though in a sense N's are almost redundant in a FASTQ sequence, since each base is accompanied by a probability, and N presumably just means equal probability of all four bases.
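
A rough Python sketch of that reading of FASTQ, for illustration only: it assumes Phred+33 quality characters, the standard Phred relation P(error) = 10^(-Q/10), and a simple model in which the error mass is split evenly over the three other bases. The function name `base_probabilities` is hypothetical.

```python
BASES = ("A", "C", "G", "T")


def base_probabilities(base, quality_char, offset=33):
    """Per-base probabilities for one FASTQ call (assumed Phred+33 encoding)."""
    q = ord(quality_char) - offset
    if q < 0:
        raise ValueError(f"quality character {quality_char!r} is below the offset")
    if base == "N":
        # Treat N as carrying no information: all four bases equally likely.
        return {b: 0.25 for b in BASES}
    if base not in BASES:
        raise ValueError(f"unexpected base {base!r}")
    p_error = 10 ** (-q / 10)  # Phred definition: Q = -10 * log10(P(error))
    # Assumed model: the error probability is shared evenly by the other bases.
    probs = {b: p_error / 3 for b in BASES}
    probs[base] = 1 - p_error
    return probs
```

Under this model an N and a called base with a very low quality score both approach the uniform distribution, which is the sense in which N is nearly redundant.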

TransGirlCodes commented 8 years ago

Is there functionality to read in sequences as 2-bit without the intermediate conversion? People at work will probably ask me this when I tell them.

bicycle1885 commented 8 years ago

No, but I agree that we need it to avoid the extra cost of conversion and allocation.

N can be any nucleotide, and converting it to A makes sense in most situations. However, I believe that kind of implicit behavior will lead to hard-to-find bugs in some situations. It would be great to have it as an opt-in parser option, though.

blahah commented 8 years ago

Completely agree about releasing v0.2 given all the new functionality.

kdm9 commented 8 years ago

Can I ask for a new release of Switch.jl to coincide with Bio.jl's 0.2 release?

bicycle1885 commented 8 years ago

I'm not responsible for Switch.jl. @dcjones has control of it.

kdm9 commented 8 years ago

OK. Is he still buried under 6 feet of PhD (if so, commiserations @dcjones!)? Would he consent to the repo moving under the biojulia umbrella?

bicycle1885 commented 8 years ago

Bio.jl doesn't run well on Julia 0.5-dev. More and more functions are going to be deprecated before the release of Julia 0.5. I think I need to delay the new release of Bio.jl until a release candidate of Julia 0.5 is available.

bicycle1885 commented 8 years ago

I recently saw a discussion about package development (https://github.com/JuliaLang/julia/issues/16681). I really understand sfchen's annoyance; it is really painful to support two different Julia versions, especially when one of them is the cutting edge of the master branch. It also impedes our development of Bio.jl, and I'm now inclined to support only a single Julia version in Bio.jl. Since Bio.jl is getting larger, it will continue to become more painful until Julia reaches v1.0.

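My suggestion is to follow the development style of JuliaOpt (https://github.com/JuliaLang/julia/issues/16681#issuecomment-222821968):

  • Support only the latest stable release of Julia.
  • Do not test on the development branch of Julia.

So, taking this style, the remaining tasks towards Bio.jl 0.2 would be:

  • Create release-0.2 branch from the current master.
  • Limit Julia requirement to 0.4 only.
  • Revert commits that were introduced for the sake of Julia 0.5 compatibility.
  • Tag and release Bio.jl 0.2.

These tasks are quite easy and can be finished within a few days.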

Once Julia 0.5 (or a pre-release) is out, we can immediately switch to the new Julia version without caring about compatibility.

TransGirlCodes commented 8 years ago

Sounds reasonable to me.

On Wed, Jun 1, 2016 at 7:03 AM, Kenta Sato (佐藤 建太) <notifications@github.com> wrote:

I recently saw a discussion about a package development (JuliaLang/julia#16681, https://github.com/JuliaLang/julia/issues/16681). I really understand sfchen's annoyance; it is really painful to support two different Julia versions, especially cutting-edge of the master branch. It also impedes our development of Bio.jl, and I'm now inclined to support only a single Julia version in Bio.jl. Since Bio.jl is getting larger, it will continue to become more painful until Julia reaches v1.0.

My suggestion is following the development style of JuliaOpt (JuliaLang/julia#16681 comment, https://github.com/JuliaLang/julia/issues/16681#issuecomment-222821968):

  • Support only the latest stable release of Julia.
  • Do not test on the development branch of Julia.

So, taking this style, the remaining tasks towards Bio.jl 0.2 would be:

  • Create release-0.2 branch from the current master.
  • Limit Julia requirement to 0.4 only.
  • Revert commits that were introduced for the sake of Julia 0.5 compatibility.
  • Tag and release Bio.jl 0.2.

These tasks are quite easy and can be finished within a few days.

Once Julia 0.5 (or pre-release) is released, we can immediately switch to the new Julia compiler without caring about compatibility.
