matsengrp / cft

Clonal family tree

Build and deploy latest data #230

Closed metasoarous closed 5 years ago

metasoarous commented 6 years ago

Laura has some new timepoint data that @psathyrella is working on building. It might be nice to get this built out with the yaml format (#221). Does that seem reasonable, Duncan? What's your ETA on having that data built?

metasoarous commented 6 years ago

@psathyrella I'm assuming this is the v15/v4 build? Is there an ETA on completion there?

lauradoepker commented 6 years ago

Nope - those builds (v15, v4) were solely for BF520.1 samples that weren’t downsampled before running. Duncan has since fixed his side so that he never has to downsample unless needed (he caps cluster size), and new runs (which aren’t planned in ink yet) would be v16 etc. So the newest data for this issue are indeed v14 (Laura-mb), v3 (Laura-mb-2), and v1 (qa255-synth).

The v1 QA255 data is what we were waiting for and that's almost done: @psathyrella just needs to run some stuff with crashing light chain clusters and also complete best-plus-clustering steps for a few. Everything else is there already.

Duncan is also running QA013 in the future from our latest chip, but that's down the line a little bit.

metasoarous commented 6 years ago

Ok, thanks for clarifying. I'll wait on running qa255-synth until I get the word.

I'm assuming you still want kate-qrs/v14 included as well?

lauradoepker commented 6 years ago

Yes please. I’ll check in with @psathyrella on status

metasoarous commented 6 years ago

@lauranoges I think you have everything you need now, but I know you were kind of busy with other things and hadn't had a chance to check yet. Let me know when you do, and if everything looks good feel free to close this issue.

lauradoepker commented 6 years ago

I think things look good from a CFT standpoint. There is one red flag I'm seeing: one of the computer seeds (CARGPFPNYYGPGSYWGGFDYW) isn't properly pulling in the mature seed BF520.1 heavy sequence for the IgG cluster. I think that's partis and not CFT, though? @psathyrella any thoughts?

psathyrella commented 6 years ago

BF520.1 is in my output for that computer seed (see here), generated with

./datascripts/run.py seed-partition --study bf520-synth --extra-str=v17 --logstr computer-seeds --seed-origin computer --only-merged --view-ascii --seed-uids CARGPFPNYYGPGSYWGGFDYW --loci igh

But I've apparently forgotten what we're doing here, because shouldn't it not be there? I mean, I always specify all the seed sequences with --queries-to-include, but didn't we pick the computer seeds to be in different clonal families from bf520.1?

EDIT: it turns out this computer seed was a control chosen specifically to be in the bf520.1 lineage, so that part is no longer confusing.

lauradoepker commented 6 years ago

Thanks for the clarification and digging. @metasoarous then this might indeed be a CFT issue - can you find out why the BF520.1 sequence wasn't included in the pared-down clusters of computer seed CARGPFPNYYGPGSYWGGFDYW?

metasoarous commented 6 years ago

I'm not seeing any of the real seeds showing up in the computer seeds data, which makes me think that these sequences may have been renamed for the computer seeds run?

metasoarous commented 6 years ago

I reread the comments above and looked more carefully at the partis output files for this synthetic cluster. It looks to me like the issue is just that at partition step 0, BF520.1 actually falls in a different cluster from CARGPFPNYYGPGSYWGGFDYW. The two don't merge until partition step 1. So no mystery here, I think.
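
For anyone wanting to repeat this kind of check, here is a minimal sketch. It assumes the partition steps have already been loaded into a plain list, where each step is a list of clusters and each cluster is a list of sequence ids. That layout, the function name `first_shared_step`, and the toy ids `seq-a` and `seq-b` are all hypothetical; this is not the partis output format or API.

```python
def first_shared_step(partitions, uid_a, uid_b):
    """Return the index of the first partition step in which uid_a and uid_b
    share a cluster, or None if they never do.

    `partitions` is an assumed in-memory layout: a list of steps, each step a
    list of clusters, each cluster a list of sequence ids.
    """
    for step, clusters in enumerate(partitions):
        for cluster in clusters:
            if uid_a in cluster and uid_b in cluster:
                return step
    return None


# Toy data mirroring the situation described above (ids other than the seed
# and BF520.1 are made up for illustration).
partitions = [
    # step 0: the computer seed and BF520.1 sit in different clusters
    [["CARGPFPNYYGPGSYWGGFDYW", "seq-a"], ["BF520.1", "seq-b"]],
    # step 1: the two clusters have merged
    [["CARGPFPNYYGPGSYWGGFDYW", "seq-a", "BF520.1", "seq-b"]],
]

print(first_shared_step(partitions, "CARGPFPNYYGPGSYWGGFDYW", "BF520.1"))  # prints 1
```

If a downstream step only reads the clusters from step 0, a sequence that joins the seed's cluster at a later step would be absent, which matches what was observed here.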