walaj / SeqLib

C++ htslib/bwa-mem/fermi interface for interrogating sequence data
http://bioinformatics.oxfordjournals.org/content/early/2016/12/21/bioinformatics.btw741.full.pdf+html

Could you settle with official bwa and fermi-lite libraries #13

Open tillea opened 7 years ago

tillea commented 7 years ago

Hi, I intend to package freebayes for Debian, and the latest version uses SeqLib. So I tried to package SeqLib as well and stumbled over the problem that BWA and FML have conflicting bseq1_t declarations. It turned out that you are using personal forks of BWA and FML, not the upstream ones. When looking for bseq1_t in the SeqLib clones of BWA and FML, no definition appears. However, definitions do appear in the respective upstream repositories. It looks like you patched the BWA and FML code bases to be able to mix them without a multiple-definition error. Do you see any chance to solve this so that BWA and FML co-exist nicely? Kind regards, Andreas.
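
For readers who have not hit this kind of clash before, it can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the actual BWA or fermi-lite code: the struct fields are simplified, and the names bwa_seq1_t and to_fml are hypothetical (fml_seq1_t is the name proposed later in this thread). If both structs were still called bseq1_t, a single translation unit including both headers would see two different types under one name and fail to compile; with distinct names they can coexist.

```cpp
#include <cstdint>

// Hypothetical stand-in for the read record that BWA declares as bseq1_t
// (fields simplified for illustration).
typedef struct {
    int l_seq;
    char *name, *seq, *qual;
} bwa_seq1_t;  // hypothetical name, chosen only so this sketch compiles

// Hypothetical stand-in for the fermi-lite read record after renaming it
// away from bseq1_t (fields simplified for illustration).
typedef struct {
    int32_t l_seq;
    char *seq, *qual;
} fml_seq1_t;

// Once the names differ, one translation unit can use both types,
// for example to hand BWA-style reads to fermi-lite-style code.
// Hypothetical helper; it makes a shallow copy, so the pointers are shared.
inline fml_seq1_t to_fml(const bwa_seq1_t &b) {
    fml_seq1_t f;
    f.l_seq = b.l_seq;
    f.seq = b.seq;
    f.qual = b.qual;
    return f;
}
```

Renaming on only one side is enough to remove the collision, which is why a prefix change in either library would let both upstream headers be included together.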

walaj commented 7 years ago

Hi Andreas, It would be great to have this packaged for Debian. If I understand you correctly, you are looking to link to the existing FML and BWA libraries, rather than the SeqLib-specific ones from my personal forks. If you have a suggestion for how I could resolve the bseq1_t multiple definition error during linking, without redefining them in either BWA or FML, I'd be happy to do that. I wasn't able to come up with a good solution, and just as you saw, it led me to make my own forks of the repositories so I could fix multiple definition issues.

Another solution (better) would be to have freebayes use a SeqLib-lite version without FML/BWA, since it doesn't actually use the BWA or FML libraries but rather the htslib wrapper functionality. I'd have to work with the freebayes people on this.

tillea commented 7 years ago

Hi Jeremiah,

thanks for your very quick reply.

On Fri, Jan 27, 2017 at 07:25:55AM -0800, Jeremiah Wala wrote:

It would be great to have this packaged for Debian. If I understand you correctly, you are looking to link to the existing FML and BWA libraries, rather than the SeqLib-specific ones from my personal forks.

Yes, that's correct.

If you have a suggestion for how I could resolve the bseq1_t multiple definition error during linking, without redefining them in either BWA or FML, I'd be happy to do that. I wasn't able to come up with a good solution, and just as you saw, it led me to make my own forks of the repositories so I could fix multiple definition issues.

I think we possibly need to talk with the BWA and FML authors. I might try this and keep you updated.

Another solution (better) would be to have freebayes use a SeqLib-lite version without FML/BWA, since it doesn't actually use the BWA or FML libraries but rather the htslib wrapper functionality. I'd have to work with the freebayes people on this.

This would be a great intermediate solution for my initial target, but ultimately having SeqLib packaged for Debian would be interesting for the Debian Med team (which cares for free software in life sciences in Debian).

Kind regards

 Andreas.

-- http://fam-tille.de

ghisvail commented 7 years ago

Another solution (better) would be to have freebayes use a SeqLib-lite version without FML/BWA

This is bad. We don't need more vendored modified versions of libraries. The solution should come from upstream FML / BWA which are incompatible at the moment.

it led me to make my own forks of the repositories so I could fix multiple definition issues.

Did you consider reporting this issue upstream at some point? If not, you should have. If yes, please point us to the relevant thread so we don't repeat ourselves.

walaj commented 7 years ago

@ghisvail Fair enough re: modified SeqLib. I was thinking that for FreeBayes it doesn't really need to link to BWA/FML, but separating this out is not necessary if we just resolve the issue for the single fully-complete SeqLib.

I have not reported the bseq1_t issue to either bwa or fml. Probably should have, but didn't. I can submit the issue report, but will probably need to make a PR that implements the name-switching.

tillea commented 7 years ago

Hi, as you might have noticed in issue #3 of fermi-lite, I proposed a patch that solves the issue with bseq1_t. When I build the Debian package of fermi-lite with this patch, I run into a different error, which smells similar but is not caused by fermi-lite as far as I can see:

BFC.cpp:351:5: error: ‘ch’ was not declared in this scope
     ch = fml_count(n_seqs, m_seqs, bfc_opt.k, bfc_opt.q, bfc_opt.l_pre, bfc_opt.n_threads);
     ^~
BFC.cpp:351:28: error: ‘m_seqs’ was not declared in this scope
     ch = fml_count(n_seqs, m_seqs, bfc_opt.k, bfc_opt.q, bfc_opt.l_pre, bfc_opt.n_threads);
                            ^~~~~~
BFC.cpp:351:90: error: ‘fml_count’ was not declared in this scope
     ch = fml_count(n_seqs, m_seqs, bfc_opt.k, bfc_opt.q, bfc_opt.l_pre, bfc_opt.n_threads);
                                                                                          ^
BFC.cpp: In member function ‘void SeqLib::BFC::correct_reads()’:
BFC.cpp:369:5: error: ‘es’ was not declared in this scope
     es.ch = ch;
     ^~
BFC.cpp:369:13: error: ‘ch’ was not declared in this scope
     es.ch = ch;
             ^~
BFC.cpp:370:15: error: ‘bfc_opt’ was not declared in this scope
     es.opt = &bfc_opt;
               ^~~~~~~
BFC.cpp:372:15: error: ‘m_seqs’ was not declared in this scope
     es.seqs = m_seqs;
               ^~~~~~
BFC.cpp:377:50: error: ‘bfc_ch_hist’ was not declared in this scope
     int mode = bfc_ch_hist(es.ch, hist, hist_high);
                                                  ^
BFC.cpp:394:29: error: ‘BFC_EC_MIN_COV_COEF’ was not declared in this scope
     bfc_opt.min_cov = (int)(BFC_EC_MIN_COV_COEF * kcov + .499);
                             ^~~~~~~~~~~~~~~~~~~
BFC.cpp:403:31: error: ‘kmer_correct’ was not declared in this scope
     kmer_correct(&es, mode, ch);
                               ^
BFC.cpp: In member function ‘void SeqLib::BFC::FilterUnique()’:
BFC.cpp:416:6: error: ‘m_seqs’ was not declared in this scope
  if (m_seqs[i].seq)
      ^~~~~~

Any hint how to work around this? Kind regards, Andreas.

walaj commented 7 years ago

Hi Andreas, It looks like you renamed it fml_seq1_t in your patch, whereas I renamed it fseq1_t in my fork. BFC.h is expecting m_seqs to be an array of fseq1_t. Changing fml_seq1_t to fseq1_t in your patch would probably allow SeqLib to compile without having to change SeqLib.
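
If renaming on either side is undesirable, a one-line type alias could bridge the two names. The following is only a sketch of that idea, not something either patch contains: the struct is a simplified stand-in for the patched fermi-lite type, total_bases is a hypothetical helper, and static_assert assumes a C++11 compiler.

```cpp
#include <cstdint>
#include <type_traits>

// Simplified stand-in for the type a patched fermi-lite header would
// provide under the name fml_seq1_t (fields are an assumption).
typedef struct {
    int32_t l_seq;
    char *seq, *qual;
} fml_seq1_t;

// Compatibility alias: code written against fseq1_t keeps compiling
// when the library only provides fml_seq1_t.
typedef fml_seq1_t fseq1_t;

// Both names refer to the same type, so no conversion is needed.
static_assert(std::is_same<fseq1_t, fml_seq1_t>::value,
              "fseq1_t must alias fml_seq1_t");

// Hypothetical helper written against the old name: sums read lengths.
inline int64_t total_bases(const fseq1_t *seqs, int n) {
    int64_t total = 0;
    for (int i = 0; i < n; ++i) total += seqs[i].l_seq;
    return total;
}
```

An array declared as fml_seq1_t can be passed straight to total_bases, since the alias introduces no new type.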

tillea commented 7 years ago

Hi Jeremiah,

On Thu, Feb 02, 2017 at 07:29:26AM -0800, Jeremiah Wala wrote:

It looks like you renamed it fml_seq1_t in your patch, whereas I renamed it fseq1_t in my fork.

Yes. I did this intentionally and also patched libSeqLib locally to match. I think this follows a descriptive naming convention more closely, and other identifiers in fermi-lite also use the fml_ prefix, which makes for better consistency.

BFC.h is expecting m_seqs to be an array of fseq1_t. Changing fml_seq1_t to fseq1_t in your patch would probably allow SeqLib to compile without having to change SeqLib.

BTW, I wrote another patch that I basically stole (hopefully correctly) from your fork. It extracts bfc.h from bfc.c, which seems to be used in libSeqLib. Unfortunately I had to run and leave my desk, so this is totally untested and will probably not work. This is just to let you know that I am working on it. You can find the patch here: https://anonscm.debian.org/cgit/debian-med/fermi-lite.git/tree/debian/patches/bcf_seqlib.patch

Kind regards, Andreas.

tillea commented 7 years ago

Hi again, with the patches for a potential Debian package at https://anonscm.debian.org/cgit/debian-med/fermi-lite.git/tree/debian/patches I can build SeqLib. I also removed the jsoncpp and ssw code copies. I wonder which header files should be installed by a potential Debian package. It would be great if you could review these patches, which are meant to work with the suggested fermi-lite patches. Kind regards, Andreas.

walaj commented 7 years ago

Hi Andreas, I am seeing the fermi-lite repos (https://anonscm.debian.org/git/debian-med/fermi-lite.git), but when I clone I am still seeing the old bseq1_t structure. I know I'm missing something here. Could you guide me a bit as to what I should be looking for? From what I understand, I should be trying to see if I can point SeqLib to the patched fermi-lite repos you put together? Are there further modifications to SeqLib that I can review? Any more instructions or a SeqLib PR would be helpful for my understanding. Thank you for your work on this. Best, Jeremiah

tillea commented 7 years ago

Hi Jeremiah,

On Sun, Feb 05, 2017 at 08:31:34AM -0800, Jeremiah Wala wrote:

I am seeing the fermi-lite repos (https://anonscm.debian.org/git/debian-med/fermi-lite.git), but when I clone I am still seeing the old bseq1_t structure. I know I'm missing something here. Could you guide me a bit as to what I should be looking for? From what I understand, I should be trying to see if I can point SeqLib to the patched fermi-lite repos you put together? Are there further modifications to SeqLib that I can review? Any more instructions or a SeqLib PR would be helpful for my understanding.

In the Debian packaging, all changes are done in debian/patches. You can apply them with quilt push -a

Thank you for your work on this.

You are welcome. Feel free to keep asking if something remains unclear.

walaj commented 7 years ago

Hi Andreas, I just wanted to let you know that I haven't forgotten about this, and I appreciate your help and patience. I have some big project deadlines that have taken all my free cycles, but I will look into this more soon after those are cleared out. Best, Jeremiah

ekg commented 7 years ago

Is there any way I can help?

walaj commented 7 years ago

Thanks so much Eric and Andreas, What would be most helpful would be to have the changes formatted as a Git pull request. I've been meaning to learn the Debian system, but haven't had a block of time yet to do so (PhD thesis/defense due imminently...). I imagine it's not difficult at all, I'm just being unusually stingy with time this particular month.

tillea commented 7 years ago

Hi Jeremiah, I admit that creating my own clone of SeqLib just to enable you to apply a patch does not seem sensible to me. I've made the changes in my local clone and exported them via git format-patch. You can simply unzip this zip file and run git am 0001-Avoid-name-space-conflict-with-official-bwa-and-ferm.patch, which imports the patch. Please let me know if you prefer a pull request anyway, and I'll do my best to make your work as easy as possible. Hope this helps, Andreas.

walaj commented 7 years ago

Ok, getting closer here. I was able to apply your patch to SeqLib (per the latest instructions) and see the changes (e.g. fseq1_t --> fml_seq1_t). What I need now is to compile SeqLib, but using the new fermi-lite repos with your patch. That's what I'm not sure where to find. I cloned the repository (git clone git://anonscm.debian.org/debian-med/fermi-lite.git), but wasn't able to figure out how to apply the patch. I have quilt now, but could use some guidance as to where to obtain the fermi-lite patch file. I assume I then download the patch file to the cloned fermi-lite directory, and then just run quilt push -a? Am I understanding this correctly?

tillea commented 7 years ago

On Fri, Mar 03, 2017 at 08:45:31AM -0800, Jeremiah Wala wrote:

Ok, getting closer here. I was able to apply your patch to SeqLib (per the latest instructions) and see the changes (e.g. fseq1_t --> fml_seq1_t). What I need now is to compile SeqLib, but using the new fermi-lite repos with your patch. That's what I'm not sure where to find. I cloned the repository (git clone git://anonscm.debian.org/debian-med/fermi-lite.git), but wasn't able to figure out how to apply the patch. I have quilt now, but could use some guidance as to where to obtain the fermi-lite patch file. I assume I then download the patch file to the cloned fermi-lite directory, and then just run quilt push -a? Am I understanding this correctly?

It is correct that git clone packaging_git plus quilt push -a applies all patches. However, the packaging itself has a different strategy for linking to third-party software. It seems that rebuilding git subrepositories has become a favourite way to compile third-party software. This strategy is not acceptable in Debian, since we create library packages from each project separately, and thus we apply patches to create a "proper" build system. So I also fixed the SeqLib build system to create dynamic libraries using libtool and added pkg-config support. The same is true for fermi-lite.

While this is correct for Debian, it has the consequence that your build strategy might be affected by some patches, and thus applying all patches is possibly not what you want. You can easily disable patches by commenting out or removing lines in debian/patches/series. On the other hand, it would probably be best if fermi-lite made the changes proposed in issue 5, and then you can keep following your strategy. Hope this explanation helps, Andreas.

tillea commented 7 years ago

Hi, just to let you know that the new development cycle of Debian has started now after Debian 9.0 was released. I've uploaded the patched versions - but we need to convince the fermi-lite author to apply our patches ...

tillea commented 6 years ago

Any news here? Do you need further help?