matsengrp / cft

Clonal family tree

Build top 50 families if possible #258

Closed by eharkins 5 years ago

eharkins commented 5 years ago

Laura would like more clonal families to be built per sample if possible. @metasoarous currently has the depth (the number of clonal families processed per sample) set to 20, but if resources and optimizations allow, 50 is the goal.

metasoarous commented 5 years ago

@eharkins Once things are in a good place with CFT/Olmsted, let's go ahead and carry out this build. And again, please let me know if you have any questions or need to be pointed to my meta build script again.

eharkins commented 5 years ago

@metasoarous

> carry out this build

On some specific dataset? I thought this was a general hope for increased depth in the future.

metasoarous commented 5 years ago

We may be able to run at depth 50 right now if you do it one dataset at a time as in the script at /home/matsengrp/working/csmall/cft/build.sh. I haven't tried yet.

I think ideally we'd want all of the datasets we could get.

matsen commented 5 years ago

I didn't realize that we were being bound by SCons in this way. I'm definitely in favor of using whatever hacks are needed to make things work now, but we need to keep thinking about the big picture.

eharkins commented 5 years ago

Maybe we can try to use https://github.com/nhoffman/bioscons for optimization? In terms of building all the data we have right now, is it worth waiting to merge things in progress like #265 and #270?

metasoarous commented 5 years ago

I'm not sure how bioscons helps us here. Did you have something particular in mind?

eharkins commented 5 years ago

No, I didn't. @matsen mentioned it and said that it allows Slurm job submission and scheduling, as opposed to just using scons -j.

metasoarous commented 5 years ago

Right, but we're already using Slurm, with more or less the same trick as in bioscons.

eharkins commented 5 years ago

Ah, ok. That much was not clear to me.