issues
lutteropp/hakmer-ng-redesign
0
stars
0
forks
Add --earlyStop option that lets the user specify after how much extracted sequence data we want to stop
#76
lutteropp
opened
5 years ago
0
Run hakmer-ng on the supermatrices from Antonis' paper
#75
lutteropp
opened
5 years ago
0
Augment blocks with mismatches a bit later? And maybe only if we didn't find enough seeds/ not enough taxa per seed?
#74
lutteropp
opened
5 years ago
0
Put very promising seeds on a fast track/ directly process them
#73
lutteropp
opened
5 years ago
1
Fix yet another '$' in supermatrix bug
#72
lutteropp
closed
5 years ago
1
Compute harmonic mean of average pairwise genome substitution rates
#71
lutteropp
closed
5 years ago
1
With same number of taxa, prefer blocks with lower surrounding subRate
#70
lutteropp
closed
5 years ago
1
-r option causes segfault on cluster
#69
lutteropp
closed
5 years ago
1
Don't search for approximate matches in taxa that were rejected due to paralogy issues
#68
lutteropp
closed
5 years ago
0
Perform an iterative search for approximate matches
#67
lutteropp
closed
5 years ago
1
Speed up search for approximate matches
#66
lutteropp
closed
5 years ago
1
If we have multiple approximate matches in a taxon, don't add any of them
#65
lutteropp
closed
5 years ago
1
Make maximum accepted average substitution rate a dynamically chosen parameter
#64
lutteropp
opened
5 years ago
1
Implement trimming of already aligned extended block
#63
lutteropp
opened
5 years ago
1
Sample taxa more evenly
#62
lutteropp
closed
5 years ago
2
Maybe set the flankSize to the seed size?
#61
lutteropp
closed
5 years ago
1
Something is wrong with the supermatrix built for the w252 dataset
#60
lutteropp
closed
5 years ago
9
Compute statistics about how much sequence data has been extracted from each taxon
#59
lutteropp
closed
5 years ago
0
Vectorize the code (very low priority, but still a fun exercise)
#58
lutteropp
opened
5 years ago
0
w2016 dataset throws malloc memory corruption error
#57
lutteropp
closed
5 years ago
1
Plot sequence data usage for each minimum seed size if we didn't care about overlaps/reuse of sequence data
#56
lutteropp
closed
5 years ago
1
Implement elbow criterion
#55
lutteropp
closed
5 years ago
0
Improve block priority scoring
#54
lutteropp
closed
5 years ago
2
Postpone adding of extracted blocks with all-equal sites in order to favor more informative blocks?
#53
lutteropp
closed
5 years ago
2
Adapt flankwidth to fit the kmer seed size - the larger the seed, the larger the flank size can be
#52
lutteropp
closed
5 years ago
0
If total amount of extracted sequence data is too low (e.g., lower than 10%), do a second run with lower kmin
#51
lutteropp
closed
5 years ago
0
Always write info file
#50
lutteropp
closed
5 years ago
0
Add protein data support
#49
lutteropp
opened
5 years ago
0
Improve dealing with paralogs
#48
lutteropp
opened
5 years ago
4
Perform better block MSA - maybe with MUSCLE?
#47
lutteropp
opened
5 years ago
1
Use average number of substitutions
#46
lutteropp
closed
5 years ago
4
Increase k in a better way than one by one, e.g. by looking at the lcp-array entries...
#45
lutteropp
closed
5 years ago
0
Don't run MSA on the seeds, but really use that trimming information
#44
lutteropp
closed
5 years ago
1
Still augment seeds with mismatches, but only after all other seeds have been found?
#43
lutteropp
closed
5 years ago
1
Fix missing data statistics computation
#42
lutteropp
closed
5 years ago
0
A block should only store its relevant sequence coordinates - do the MSA later on
#41
lutteropp
closed
5 years ago
0
Very simple restructuring - if partial extension turns out to be a good idea, then the entire hakmer-ng code can be made MUCH simpler (and probably more efficient, too)...
#40
lutteropp
opened
5 years ago
1
Modify site-selection-criteria scripts to deal with split/ non-consecutive partitions
#39
lutteropp
opened
5 years ago
0
Improve extension: Add trimmed extension, that is, allow just a subset of taxa in the block to be extended if some taxa already say stop
#38
lutteropp
closed
5 years ago
0
Prune overlapping seeds instead of discarding them
#37
lutteropp
closed
5 years ago
0
Store suffix array in a file and check if it's already there
#36
lutteropp
closed
5 years ago
0
Make param_ranges input possible in order to not recompute the suffix array every time
#35
lutteropp
closed
5 years ago
1
Implement variant: Start with large kmin, then gradually reduce the kmin size...
#34
lutteropp
closed
5 years ago
0
Implement variant: Only do approximate match augmentation after all blocks have been chosen and extended
#33
lutteropp
closed
5 years ago
2
Implement variant: First take all seeds, then do extensions
#32
lutteropp
closed
5 years ago
1
Speed up iterative seeded block extraction by only looking for exact counts and skipping uninteresting suffix array regions
#31
lutteropp
closed
5 years ago
1
Think about merging otherwise compatible superseeds, e.g., by pruning one of them
#30
lutteropp
opened
5 years ago
0
Prefer the block with larger k-mer size if the number of taxa in two blocks is the same
#29
lutteropp
closed
5 years ago
0
Store the per-block-MSAs externally in some file, they won't fit into the RAM for large datasets!
#28
lutteropp
closed
5 years ago
0
Improve dealing with reverse complements
#27
lutteropp
closed
5 years ago
1