t-neumann / slamdunk

Streamlining SLAM-seq analysis with ultra-high sensitivity
GNU Affero General Public License v3.0
39 stars 23 forks source link

Implement mixture model compatibility #154

Closed isaacvock closed 6 months ago

isaacvock commented 6 months ago

I still need to run some tests and update the documentation accordingly, but this is my idea for how to address #153. After diving into the code base more, it seemed like the proposed change would make more sense as an optional addition to the count dunk. Currently I have it set to the non-default option, but I don't think it wouldn't be a bad idea to create this output by default; it should only incur a minimal additional RAM and runtime penalty, though there are some less trivial disk space requirements (depends on bam file size, probably in the 100 MB - 1GB range for your average bam file).

Let me know if you oppose this implementation strategy, or if anything stands out as obviously buggy with this implementation. If you are ok with the changes, I will work to run some tests, edit the documentation, and update the pull request.

t-neumann commented 6 months ago

@isaacvock In principal fine with the changes, but could you please not merge it into master but maybe a new 'cb' branch, so we keep the development there until it's fully tested?

isaacvock commented 6 months ago

Of course, but I need you to make said branch in order to merge into it

t-neumann commented 6 months ago

Sorry yes, done

isaacvock commented 6 months ago

I'm sorry for not communicating this better, but I was expecting the workflow to be:

  1. Make pull request (don't merge yet) so that you can provide any broad stroke feedback on the intended changes
  2. Run tests on my end
  3. Update pull request with any edits necessary to pass tests. Also willing to run any additional tests you have in mind. I have a fastq file from a random publication that I processed with SLAMDUNK + new functionality + bakR and that I was going to process with GRAND-SLAM and bam2bakR + bakR. Idea is to compare fraction new (aka NTR) estimates across the pipelines to ensure good concordance.
  4. Inform you once all tests are passing on my end
  5. Update pull request to include necessary changes to the documentation
  6. You merge the pull request

What trajectory did you have in mind? The initial pull request still didn't implement the alternate pipeline in slamdunk all, just count; so I'll have to make a new pull request with those changes. I think it would be best to avoid having lots of different pull requests all aimed at the same final goal, so would you be ok with following the steps outlined above in this next pull request?

t-neumann commented 6 months ago

Hi - actually that was a rather brainless merge indeed, my apologies, you are right.

Since the cb branch is anyways not in production, it doesnt matter too much but yes please submit another PR with the next changes and this time I won't merge them in the cb branch until you give me the go