bmansfeld / QTLseqr

QTLseqr is an R package for QTL mapping using NGS Bulk Segregant Analysis

License Request #68

Closed cmatKhan closed 6 months ago

cmatKhan commented 6 months ago

I am interested in using and modifying this code to make use of Bioconductor base classes (SummarizedExperiment, GenomicRanges, etc.). I'd like to copy some code directly.

However, there is no license. Strictly speaking, this is not open source without one, and I should not use the code as a result.

https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/licensing-a-repository#choosing-the-right-license

The lab I work in does use this software and has cited the paper (e.g., https://doi.org/10.1016/j.chom.2023.10.002). But it would make me more comfortable using the code if it had an open source license.

If I do have the time to refactor the codebase into a more Bioconductor-centric package that is general purpose, rather than purpose-built for our experiments, I would be interested in adding it to Bioconductor. I would be more than happy to issue a pull request to this repo and then work with you guys to do that, if you are interested. Conversely, I won't do it, and I won't copy code, if you don't add a license -- I'll take that as a 'bugger off!' and no hard feelings -- though of course we'd still cite the paper.

bmansfeld commented 6 months ago

Hey Chase, when I built this package in grad school I added the license as part of the package DESCRIPTION file and never added a dedicated license file or anything else. I should probably add that... I consider it under GPL-3, as you can see there. I appreciate your respectful approach to the open source-ness of the package.

In any case, it would be very cool to update QTLseqr to be compatible with Bioconductor. In my original plan I was interested in doing that, but there was so much to learn and I wasn't ready to conform to their strict guidelines at the time. Additionally, I found that many non-model species (specifically plants, which I work on) do not have a lot of support in Bioconductor classes, so most folks don't end up building them and thus end up not using these great tools.

That being said, I have a better idea. I am not sure whether you are aware, but I recently started as a prof at WUSTL in the Biology dept. We should meet up and chat about possibilities to collaborate on a QTLseqr v2. I have some ideas and need some coding support.

Happy to chat more. Feel free to email me at bmansfeld at wustl

Ben

cmatKhan commented 6 months ago

I didn't think to look in the DESCRIPTION -- my bad.

I did not know that you are a prof at WUSTL -- that is convenient. Yes, let's meet. I'll send an email on Monday to schedule a time.

I wrote the SummarizedExperiment class for the GATK VariantsToTable output on Friday. I have never worked with plant genomes, but I think that this will work. It is just a normal SummarizedExperiment where the DP, AD, etc. are separate matrices in the assays slot. The rowRanges are the variants represented as GRanges, and the colData is the sample metadata. The comparisons between 'bulks' are stored as a table as an attribute -- that will be easier to explain in person. I haven't started on the stats/plotting functions yet, but those will be straightforward to adjust to take the SummarizedExperiment object as input.
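
For anyone reading along, here is a rough sketch of the layout described above. It is not the actual class from my repo: the toy values, the sample names (`bulk_low`, `bulk_high`), and the `comparisons` table are all made up for illustration, AD is simplified to a single alternate-allele depth per sample, and storing the comparison table in `metadata()` is an assumption standing in for the attribute mentioned above.

```r
library(SummarizedExperiment)  # also attaches GenomicRanges and IRanges

# Toy data: 3 variants x 2 bulks. All values are invented for illustration.
bulks <- c("bulk_low", "bulk_high")
dp <- matrix(c(30L, 42L, 25L, 38L, 40L, 31L), nrow = 3,
             dimnames = list(NULL, bulks))   # total read depth
ad <- matrix(c(10L, 20L, 5L, 30L, 22L, 25L), nrow = 3,
             dimnames = list(NULL, bulks))   # alt-allele depth (simplified)

# One GRanges entry per variant (hypothetical positions)
variants <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
                    ranges = IRanges(start = c(100, 2500, 730), width = 1))

# Sample metadata: one row per column of the assay matrices
samples <- DataFrame(bulk = c("low", "high"), row.names = bulks)

se <- SummarizedExperiment(assays = list(DP = dp, AD = ad),
                           rowRanges = variants,
                           colData = samples)

# Assumption: keep the bulk-vs-bulk comparison table in metadata()
metadata(se)$comparisons <- data.frame(bulk_a = "bulk_low",
                                       bulk_b = "bulk_high")

# Per-sample alt-allele frequency, computed straight from the assays
alt_freq <- assay(se, "AD") / assay(se, "DP")
```

Because the depths sit in named assays, downstream stats/plotting functions can pull `assay(se, "AD")` and `assay(se, "DP")` and subset by genomic range through `rowRanges(se)` without a separate data frame.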

The rest of that repo is a re-write of the code that was used for the paper linked above, and not originally mine. Aside from that SummarizedExperiment object, in other words, there isn't anything there that would be 'v2' material.