bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0
88 stars 18 forks source link

Upgrades to refinement functions #175

Closed nickjcroucher closed 3 years ago

nickjcroucher commented 3 years ago

Fixes and upgrades to refinement, particularly with the individual refinement and GPU libraries. Examples of output with the SPARC dataset:

Fixes include:

johnlees commented 3 years ago

Also bump version in __init__

johnlees commented 3 years ago

Web tests failing. Let's consider changing them or removing them

nickjcroucher commented 3 years ago

Also bump version in __init__

Done in a67de16.

nickjcroucher commented 3 years ago

For future reference - the default values for C++ functions should be in python_bindings.cpp, not boundary.hpp, as otherwise the tests fail. Restored in 1f0946e.

johnlees commented 3 years ago

For future reference - the default values for C++ functions should be in python_bindings.cpp, not boundary.hpp, as otherwise the tests fail. Restored in 1f0946e.

Weird, not what I'd expect! What's the error?

nickjcroucher commented 3 years ago

For future reference - the default values for C++ functions should be in python_bindings.cpp, not boundary.hpp, as otherwise the tests fail. Restored in 1f0946e.

Weird, not what I'd expect! What's the error?

Traceback (most recent call last):
  File "/Users/ncrouche/Documents/PopPUNK/PopPUNK/test/test-refine.py", line 71, in <module>
    poppunk_refine.generateTuples([int(x) for x in assign0_res], -1, True))
TypeError: generateTuples(): incompatible function arguments. The following argument types are supported:
    1. (assignments: List[int], within_label: int, self: bool, num_ref: int, int_offset: int) -> List[Tuple[int, int]]

It does not like the input argument format it seems - guess this is a bit of an atypical call though.

johnlees commented 3 years ago

For future reference - the default values for C++ functions should be in python_bindings.cpp, not boundary.hpp, as otherwise the tests fail. Restored in 1f0946e.

Weird, not what I'd expect! What's the error?

Traceback (most recent call last):
  File "/Users/ncrouche/Documents/PopPUNK/PopPUNK/test/test-refine.py", line 71, in <module>
    poppunk_refine.generateTuples([int(x) for x in assign0_res], -1, True))
TypeError: generateTuples(): incompatible function arguments. The following argument types are supported:
    1. (assignments: List[int], within_label: int, self: bool, num_ref: int, int_offset: int) -> List[Tuple[int, int]]

It does not like the input argument format it seems - guess this is a bit of an atypical call though.

Looks like it's missing two arguments num_ref and int_offset. You can have these defaults set in the m.def part of python bindings and they appear as they defaults to the python call. C++ defaults (so if using the function from elsewhere in the C++ not just the python call) should still go in the .hpp I believe. It's ok to have both of these, as the defaults in the m.def part have a different meaning

nickjcroucher commented 3 years ago

Looks like it's missing two arguments num_ref and int_offset. You can have these defaults set in the m.def part of python bindings and they appear as they defaults to the python call. C++ defaults (so if using the function from elsewhere in the C++ not just the python call) should still go in the .hpp I believe. It's ok to have both of these, as the defaults in the m.def part have a different meaning

I don't really mind, I assumed we were prioritising only specifying the defaults once (guess if there were a difference, the C++ ones would override the python ones?). Can add the defaults back into the .hpp if useful, let me know.

johnlees commented 3 years ago

I don't really mind, I assumed we were prioritising only specifying the defaults once (guess if there were a difference, the C++ ones would override the python ones?). Can add the defaults back into the .hpp if useful, let me know.

I think I misunderstood when I saw the .cpp extension. If you've got defaults set in the m.def section only that's totally fine

nickjcroucher commented 3 years ago

I think I misunderstood when I saw the .cpp extension. If you've got defaults set in the m.def section only that's totally fine

Got it, thanks! - now we've made this decision, apologies in advance for inevitably forgetting about this discussion and putting unhelpful defaults into .cpp and/or .hpp in the future.

nickjcroucher commented 3 years ago

Fixed the web tests now - the only bit that might need investigation is the graph weights, as they are set to be true in args.txt, but are not actually stored in the graph any more. I've overridden this in the tests themselves, but if the visualisation needs the weights, we may need to rethink some of the network processing.

nickjcroucher commented 3 years ago

Passing tests now, updated tests on SPARC still seem to be working:

nickjcroucher commented 3 years ago

One more thought - should the info.py script report the strand-specific k-mer type of the database, and if so, would that change the sketchlib version requirement?

johnlees commented 3 years ago

Is this ready for re-review now?

One more thought - should the info.py script report the strand-specific k-mer type of the database, and if so, would that change the sketchlib version requirement?

Yes, that would be a useful addition. The best way to deal with these attributes which may or may not be present depending on version is to try and read them and potentially handle the error, or check for presence of the attribute name in the keys and set default/unknown if not there.

nickjcroucher commented 3 years ago

Great, added the check in 5facf69. Yes, now ready for re-review. Thanks for the very helpful comments!

johnlees commented 3 years ago

Happy to merge this when the tests have passed then!