Closed · anshumanmohan closed this 1 year ago
I'll point out a few things that I don't love and would love guidance on.
There was one thing that I was seeking guidance on, but I've moved that into #46 and closed this PR.
- Where to store the handmade files: I currently have them in `test/handmade/`, and this means that `make clean` does not delete them. If there's a cleverer/neater way, please let me know!
I can't quite figure out which files you think `make clean` should remove; can't you just add `rm` commands to the `clean` target?
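A minimal sketch of that suggestion follows; the `*.out` pattern is an assumption about what the saved Turnt outputs look like, and the demo filenames are invented:

```shell
# Sketch: the sort of rm command one could add to the `clean` target.
# Assumption: generated Turnt outputs end in .out; handmade inputs end in .gfa.
mkdir -p test/handmade                               # demo setup only
touch test/handmade/demo.out test/handmade/crush1.gfa
rm -f test/handmade/*.out                            # remove generated outputs
ls test/handmade                                     # the .gfa inputs survive
```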
- How to run certain algorithms against certain files and still retain sanity: I currently do, for example, `turnt --save --env crush_oracle test/handmade/crush*.gfa`, and, under `test/handmade/`, I have `crush1.gfa`, `crush2.gfa`, and `crush3.gfa`. Similar with `flip`. The idea is not to run, say, `flip` against the graphs that were made especially for `crush`.
I don't think there's anything wrong with running each graph across each algorithm; it doesn't seem to take too much time, and it improves the testing coverage of all of our algorithms. However, if you want to be able to run specific tests on an algorithm for debugging purposes, you could give the `.gfa` files an extra extension: e.g., files made specifically for `crush` would be named `.crush.gfa`, and `turnt --env crush_test test/handmade/*.crush.gfa` would run tests on just those files.
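That extension scheme could look like this in practice (the filenames here are illustrative, not files from the repo):

```shell
# Sketch of the double-extension idea: algorithm-specific inputs get an
# extra extension, so a glob selects only the files meant for that tool.
mkdir -p test/handmade
touch test/handmade/c1.crush.gfa test/handmade/f1.flip.gfa
ls test/handmade/*.crush.gfa     # matches only the crush-specific file
# turnt --env crush_test test/handmade/*.crush.gfa   # would test just these
```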
This PR improves our testing infrastructure a tad. We are (still) doing differential testing against `odgi`, so agreeing with them is always a good sign, but the problem was that the existing graphs were, in many cases, not proving to be very interesting targets for the algorithms. For instance, `crush` was having no effect on any graph, `flip` was only flipping one path in one graph, etc.

I had hoped to generate a bunch of new graphs that would run our algorithms through their paces, but it turned out there wasn't a ton to do. Here's a summary of how the various algorithms are tested after this PR:
- `chop` takes a command-line argument `n` for the chopping size to aim for. I test it by generating `n` randomly using a little snippet of Python: `n` is chosen from (2, the length of the graph's longest segment).
- `crush_n` converts "runs" of `N` into a single character `N`. To test it I have added three new files to `test/handmade/`.
- `flip`: we already disagreed with the odgi output for the graph `note5`, and we disagree with them on all the graphs that I have crafted. I will explore this further.
- `validate`: I take a GFA, randomly nix 90% of its Links, and then run `validate` on the newly crafted GFA.
- `overlap`, where the candidate paths are tested along their entire length, is tested exhaustively at present. The more interesting case, where we query a path along a certain fragment of its length, is not yet implemented in `slow-odgi`. See https://github.com/cucapra/pollen/issues/32#issuecomment-1499839056 for more.
- `degree`, `depth`, `flatten`, and `matrix` are tested reasonably well by the existing graphs.
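The two generated-input tests above (`chop`'s random `n` and `validate`'s link-nixing) can be sketched roughly like this. The GFA content and filenames are invented for illustration, and the real `chop` test uses a Python snippet rather than shell:

```shell
# Tiny invented GFA: two segments (S lines) and one link (L line).
printf 'S\t1\tAAAA\nS\t2\tTT\nL\t1\t+\t2\t+\t0M\n' > demo.gfa

# chop-style input generation: pick n from [2, longest segment length].
longest=$(awk -F'\t' '$1=="S" && length($3)>m {m=length($3)} END{print m}' demo.gfa)
n=$(( RANDOM % (longest - 1) + 2 ))
echo "n = $n"

# validate-style input generation: randomly drop ~90% of the Links,
# leaving the segments intact, then feed the result to validate.
awk 'BEGIN{srand()} $1=="L" && rand()<0.9 {next} {print}' demo.gfa > nixed.gfa
```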