Open heringerp opened 1 month ago
Yes, I think this is a very nice idea.
Running panacus on a compacted de Bruijn graph requires the parameter k. Of course, we could retrieve it from the overlap (CIGAR) of the links in the GFA, but it might be better to just pass it as a parameter. Apart from this, I don't think there is any difference for the user.
If the user informs panacus that the input graph is a c(c)DBG, then for what it's worth, we can assume that k is equal to the length of the shortest node in the graph. That can be identified very fast, so there is no need for the user to specify it explicitly.
This is not necessarily the case. Since it is a compacted graph, all nodes may be longer than k. The most reliable way is to use the links:
L 3073 - 758274 + 10M
L 3073 - 962680 + 10M
...
All links will have the same overlap; in this case, for example, 10M means that k is 11.
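To make the difference between the two heuristics concrete, here is a minimal Python sketch (not panacus code). It assumes tab-separated GFA 1 lines with the overlap in the sixth field of each L line, and that every overlap has the form `<n>M`. The function names `shortest_segment_length` and `k_from_links` and the toy sequences are made up for illustration.

```python
import re
from typing import List, Optional

# Assumption: cDBG overlaps are a single match run such as "10M".
_MATCH_ONLY = re.compile(r"([0-9]+)M")


def shortest_segment_length(gfa_lines: List[str]) -> Optional[int]:
    """Length of the shortest S-line sequence; only an upper bound on k."""
    lengths = []
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3 and fields[2] != "*":
            lengths.append(len(fields[2]))
    return min(lengths) if lengths else None


def k_from_links(gfa_lines: List[str]) -> Optional[int]:
    """Infer k as overlap + 1 from L lines; None if an overlap is missing."""
    overlaps = set()
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "L":
            continue
        if len(fields) < 6 or fields[5] == "*":
            return None  # CIGAR not given, so k cannot be inferred
        match = _MATCH_ONLY.fullmatch(fields[5])
        if match is None:
            raise ValueError(f"unexpected overlap {fields[5]!r}")
        overlaps.add(int(match.group(1)))
    if not overlaps:
        return None  # no links at all
    if len(overlaps) > 1:
        raise ValueError(f"links disagree on the overlap: {sorted(overlaps)}")
    return overlaps.pop() + 1


# Toy graph: every node is longer than k, so the shortest node overestimates k.
toy = [
    "S\t3073\t" + "A" * 15,
    "S\t758274\t" + "C" * 12,
    "S\t962680\t" + "G" * 13,
    "L\t3073\t-\t758274\t+\t10M",
    "L\t3073\t-\t962680\t+\t10M",
]
print(shortest_segment_length(toy))  # 12
print(k_from_links(toy))  # 11
```

Both passes are linear in the size of the file, so the link-based check costs about the same as looking for the shortest node.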
But this can still be checked very fast, right? So there would still be no need for letting the user specify k?
Yes, but the problem is that the matching cigar is optional:
Thanks for pointing this out, @lucaparmigiani! Does Bifrost output the CIGAR string?
Yes. Bifrost outputs the cigar. I am not so sure about other tools.
If no cigar string is given, the default assumption in GFA format is that the edge is blunt. I feel like we should assume that if c(c)DBGs are given, they must provide the cigar.
But yeah, an option that specifies k in absence of the cigar won't hurt either, would it?
I agree. So if the parameter is passed, we just assume the graph should be parsed as a cDBG; otherwise we rely on the CIGAR of the first link.
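The precedence agreed on here could be sketched as follows. This is a hypothetical Python helper, not panacus code: `resolve_k`, `cli_k`, and `first_link_overlap` are illustrative names, and treating a missing or `*` overlap as a blunt-ended graph follows the GFA default mentioned above.

```python
import re
from typing import Optional


def resolve_k(cli_k: Optional[int], first_link_overlap: Optional[str]) -> Optional[int]:
    """Return k for a cDBG, or None if the graph is treated as blunt-ended.

    cli_k: value of a hypothetical user-supplied k parameter, if any.
    first_link_overlap: overlap field of the first L line, if any.
    """
    # 1. An explicit parameter wins and marks the graph as a cDBG.
    if cli_k is not None:
        if cli_k < 1:
            raise ValueError("k must be a positive integer")
        return cli_k
    # 2. No parameter and no CIGAR: fall back to the blunt-end assumption.
    if first_link_overlap is None or first_link_overlap == "*":
        return None
    # 3. Otherwise derive k from the overlap of the first link.
    match = re.fullmatch(r"([0-9]+)M", first_link_overlap)
    if match is None:
        raise ValueError(f"unexpected overlap {first_link_overlap!r}")
    overlap = int(match.group(1))
    # Assumption in this sketch: "0M" also means blunt ends.
    return overlap + 1 if overlap > 0 else None


print(resolve_k(None, "10M"))  # 11
print(resolve_k(31, None))  # 31
print(resolve_k(None, "*"))  # None
```

Reading only the first link is cheap but trusts the file to be consistent; checking every link, as discussed earlier in the thread, would catch a mixed graph at the cost of a full pass.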
It might make sense to introduce a parameter to tell panacus whether a graph is a variation graph or a de Bruijn graph. With this we could make sure all commands work for both types of graphs, or at least tell the user if something won't work. We could also make the debug/warning statements we discussed on 2024-10-15 conditional on the graph type.
Do you agree with this @lucaparmigiani? Is there anything else to think about?