vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

external API definition (such as to support SWIG interfaces) #188

Open ekg opened 8 years ago

ekg commented 8 years ago

This is the API mega-thread. Right now there are a ton of functions (all public) in the various VG classes. This issue will be resolved once we've decided which should be public and which private, and then wrapped the public ones using SWIG or Clownfish.

I can start by looking at which functions in vg::VG are used by the vg executable. This should be a reduced set compared to what's currently public.

nerdstrike commented 8 years ago

Ok, Clownfish is out as it doesn't really do C++. SWIG can be made to work; the alternative is to bring in something like Apache Thrift (or gRPC, which is referenced in the protobuf docs) and turn the code into a standalone server. Either way, it's still valuable to identify public and private methods and classes.

ekg commented 8 years ago

Here are functions defined in vg.hpp and called from main.cpp:

comm -23 <(cat vg.hpp | perl -ne 'print "$1\n" if / (\w+?)\(/' | sort | uniq) <(cat main.cpp | perl -ne 'print "$1\n" if / (\w+?)\(/' | sort | uniq) >vg_exports_to_main.txt

vg_exports_to_main.txt

There are many:

add_edge
add_edges
add_node
add_nodes
add_nodes_and_edges
add_start_end_markers
adjacent
align
alleles
append
build_edge_indexes
build_gcsa_index
build_indexes
build_node_indexes
clear
clear_edge_indexes
clear_edge_indexes_no_resize
clear_indexes
clear_indexes_no_resize
clear_node_indexes
clear_node_indexes_no_resize
clear_paths
collect_subgraph
combine
common_ancestor_next
common_ancestor_prev
compact_ids
concatenate
concat_mapping_groups
concat_mappings_for_node_pair
concat_mappings_for_nodes
concat_nodes
connect_nodes_to_node
connect_node_to_nodes
create_edge
create_node
create_path
create_progress
decrement_node_ids
destroy_alignable_graph
destroy_edge
destroy_node
destroy_progress
dice_nodes
disjoint_subgraphs
distance_to_head
distance_to_tail
divide_node
divide_path
edge_count
edges_end
edges_of
edges_of_node
edges_of_nodes
edges_start
edit
edit_both_directions
edit_node
empty
end_degree
ensure_breakpoints
expand_context
expand_path
extend
find_breakpoints
flip_doubly_reversed_edges
force_path_match
for_each_connected_node
for_each_edge
for_each_edge_parallel
for_each_gcsa_kmer_position_parallel
_for_each_kmer
for_each_kmer
for_each_kmer_of_node
for_each_kmer_parallel
for_each_kpath
for_each_kpath_of_node
for_each_kpath_parallel
for_each_node
for_each_node_parallel
forwardize_breakpoints
from_alleles
from_gfa
full_siblings_from
full_siblings_to
gcsa_handle_node_in_graph
get_edge
get_gcsa_kmers
get_node
has_edge
hash
has_node
head_nodes
identically_oriented_sibling_sets
include
increment_node_ids
index_edge_by_node_sides
index_paths
init
is_ancestor_next
is_ancestor_prev
is_head_node
is_tail_node
is_valid
join_heads
join_tails
keep_multinode_strongly_connected_components
keep_path
keep_paths
kmer_context
kpaths
kpaths_of_node
left_degree
length
likelihoods
mapping_is_total_match
max_node_id
merge
merge_nodes
merge_union
minmax
min_node_id
multinode_strongly_connected_components
name
next_kpaths_from_node
node_count
node_count_next
node_count_prev
node_replace_next
node_replace_prev
nodes_are_perfect_path_neighbors
NodeSide
nodes_next
nodes_prev
node_starts_in_path
NodeTraversal
nonoverlapping_node_context_without_paths
normalize
operator
orient_nodes_forward
pair_from_edge
pair_from_end_edge
pair_from_start_edge
path_edge_count
path_end_node_offset
paths_as_alignments
paths_between
path_sequence
path_string
Plan
prev_kpaths_from_node
print_edges
prune_complex
prune_complex_paths
prune_complex_with_head_tail
prune_short_subgraphs
random_read
rebuild_edge_indexes
rebuild_indexes
remove_duplicated_in
remove_node_forwarding_edges
remove_non_path
remove_null_nodes
remove_null_nodes_forwarding_edges
remove_orphan_edges
resize_indexes
reverse
right_degree
same_context
seq
serialize_to_file
serialize_to_ostream
set_edge
siblings_from
siblings_of
siblings_to
sides_context
sides_from
sides_of
sides_to
simple_components
simple_multinode_components
simplify_from_siblings
simplify_siblings
simplify_to_siblings
size
slice_alleles
sort
start_degree
strongly_connected_components
swap_node_id
swap_nodes
sync_paths
tail_nodes
tmp
to_dot
to_gfa
topological_sort
total_length_of_nodes
to_turtle
transitive_sibling_sets
unchop
unindex_edge_by_node_sides
unroll
update_progress
vcf_records_to_alleles
wrap_with_null_nodes
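
For anyone wanting to reproduce or adapt the comparison, here is a minimal Python sketch of the same idea. It reuses the regex from the perl one-liner and reports both set operations, since `comm -23` prints lines unique to the first input while `comm -12` prints lines common to both. The `HEADER` and `MAIN` strings are made-up toy inputs, not real vg code, and `names_in` and `compare` are hypothetical helper names.

```python
import re

# Same pattern as the perl one-liner: a space, an identifier, then "(".
CALL_RE = re.compile(r" (\w+?)\(")


def names_in(source):
    """Collect the first space-prefixed identifier followed by '(' on each line."""
    found = set()
    for line in source.splitlines():
        match = CALL_RE.search(line)  # first match only, like perl -ne with $1
        if match:
            found.add(match.group(1))
    return found


def compare(header_text, main_text):
    """Return names shared with main and names found only in the header."""
    header = names_in(header_text)
    main = names_in(main_text)
    return {
        "used_by_main": sorted(header & main),  # equivalent of comm -12
        "header_only": sorted(header - main),   # equivalent of comm -23
    }


# Toy inputs (assumptions for illustration only).
HEADER = """\
class VG {
public:
    Node* create_node(const string& seq);
    void build_indexes(void);
    bool is_valid(void);
};
"""

MAIN = """\
int main(int argc, char** argv) {
    Node* n = create_node(seq);
    graph.build_indexes();
}
"""

if __name__ == "__main__":
    result = compare(HEADER, MAIN)
    print("used by main:", result["used_by_main"])
    print("header only: ", result["header_only"])
```

One caveat the toy input makes visible: because the pattern requires a space directly before the identifier, a member call such as `graph.build_indexes()` is not picked up, so `build_indexes` lands in the header-only set even though the toy `MAIN` calls it. A pattern that also accepts `.` or `->` before the name would probably give a more accurate picture of what the executable really uses.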

ekg commented 8 years ago

I should do the same for the other classes (such as the Mapper).

adamnovak commented 7 years ago

Do we want to keep this issue? Or should we make a new one (or maybe a milestone) to represent a broader internal/external API refactoring project?

nerdstrike commented 6 years ago

It is pretty old now. I'm still up for swigging at the right juncture, so cc me in on any new issue you make. I tried to get into the refactoring but it was too monumental for my spare time.