I should note that I'm trying to run multiple sessions of linearham in the same singularity instance in parallel -- perhaps there is a shared file.
@wfs-lilly sorry for all the roadblocks. This actually seems to be due to some bad assumptions made in the linearham code about partis cluster indices. We are working on it now and will push a fix for --cluster-ind as soon as possible. Thanks for your patience!
@wfs-lilly this should be fixed in #59, which also introduced some changes to how you should specify your cluster of interest - documented here.
Importantly, --cluster-ind as you specified it before will no longer correspond to the same cluster with the new --cluster-index option, since the cluster index is now the index of the cluster within a given partition (see the documentation linked above for more on this).
To list the clusters in the best partition from partis, run:
scripts/parse_cluster.py \
    partition.yaml \
    --fasta-output-file parsed_cluster.fa \
    --yaml-output-file parsed_cluster.yaml
on your partis output file.
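If you'd rather inspect things directly, here's a rough sketch of what the new index refers to. It assumes the partis output YAML carries a 'partitions' list whose entries have 'logprob' and 'partition' (a list of clusters, each a list of sequence ids) -- parse_cluster.py above is the supported route, this is just for illustration:

```python
# Rough sketch: the cluster index is the position of a cluster within the
# best (highest log-probability) partition of the partis output.
# Assumes the YAML layout described above; file name is a placeholder.
import yaml

with open('partition.yaml') as f:
    output = yaml.safe_load(f)

best = max(output['partitions'], key=lambda p: p['logprob'])
for index, cluster in enumerate(best['partition']):
    print(index, len(cluster), ' '.join(cluster[:3]), '...')
```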
Let me know if you have any questions about how this applies to your case. Hope this helps!
@wfs-lilly closing, but feel free to re-open if you are still having any issues related to the cluster index that are not solved by the changes introduced in #59. Thanks!
I'm trying to run linearham against the output of a partis partition. I started one job specifying cluster 0 and realized that it would take a very long time with the default parameters, so I picked a smaller cluster (more or less at random), reduced the tuning iterations, and tried that analysis. My command line is:
Singularity linearham_latest.sif:~/v5x3838/linearham/linearham> scons --run-linearham --cluster-ind=10201 --partis-yaml-file=TSPN-m6-dedup2.part.yaml --parameter-dir=_output/._data_TSPN-Bulk-VH-M6_S1_L001_RM_001.Annot.seq.dedup2/ --template-path=templates/revbayes_template.rev --num-cores=4 --tune-iter=500 &
I get the following output. (Note that I also had a parallel job with the same command line running on cluster 15, so the outputs may be interleaved, and there is definitely a message from the cluster 15 job -- apologies for the confusion. The same failure ultimately occurred for the cluster 15 job, but I'm only showing the output for the cluster 10201 job.)
I don't know how to address this error. I also note that the sequences listed in the error do not belong to cluster 10201 but to cluster 0 -- there are 1749 of them, vs. the expected 13. The cluster 10201 analysis output folder does seem to contain the correct set of sequences.
-bash-4.2$ pwd
/home/v5x3838/v5x3838/linearham/linearham/output/cluster10201
-bash-4.2$ ls
cluster_seqs.fasta  mcmciter10000_mcmcthin10_tuneiter500_tunethin100_numrates4_seed0
-bash-4.2$ fgrep -c Seq cluster_seqs.fasta
13
/home/v5x3838/v5x3838/linearham/linearham/output/cluster0
-bash-4.2$ fgrep -c Seq cluster_seqs.fasta
1749
-bash-4.2$ head cluster_seqs.fasta
Note that the sequence numbers from cluster 0 are indeed the numbers appearing in the cluster 10201 output.
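In case it's useful, here's a rough sketch of the check I'm planning to run to see which cluster in the best partition actually contains the ids written to my cluster10201 cluster_seqs.fasta. It assumes the fasta headers are the partis sequence ids and that the partition YAML exposes a 'partitions' list with 'logprob' and 'partition' keys; the file paths are from my run:

```python
# Rough diagnostic sketch (not part of linearham): find which cluster in the
# best partis partition overlaps the sequences selected for cluster 10201.
import yaml

def fasta_ids(path):
    # Collect the id portion of each fasta header line.
    with open(path) as f:
        return {line[1:].strip().split()[0] for line in f if line.startswith('>')}

with open('TSPN-m6-dedup2.part.yaml') as f:
    output = yaml.safe_load(f)
best = max(output['partitions'], key=lambda p: p['logprob'])

seqs = fasta_ids('output/cluster10201/cluster_seqs.fasta')
for index, cluster in enumerate(best['partition']):
    overlap = seqs & set(cluster)
    if overlap:
        print('index %d: %d of %d fasta ids found (cluster size %d)'
              % (index, len(overlap), len(seqs), len(cluster)))
```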
I'll review the sequence lengths and add what I find to this issue later.
Run log here