AnacletoLAB / ensmallen

🍇 Ensmallen is the Rust/Python high-performance graph processing submodule of the GRAPE library.
MIT License

G.dump_nodes doesn't respect defaults #224

Open redst4r opened 1 year ago

Hi,

I just noticed that the default values of Graph.dump_nodes() mentioned in the docs differ from what the code actually does. From the Python docs:

> G.dump_nodes?
...
verbose: bool = True
    Wether to show a loading bar while writing to file.
separator: str = '\t'
    What separator to use while writing out to file.
header: bool = True
    Wether to write out the header of the file.
nodes_column_number: int = 0
    The column number where to write the nodes.
nodes_column: str = "id"
    The name of the column of the nodes.
node_types_column_number: int = 1
    The column number where to write the node types.
node_type_column: str = "category"
    The name of the column of the node types.

At least, I assumed that the values given after the parameter names are the defaults.

Turns out that if you run

# Just using Hetionet as an example here
from grape.datasets.hetionet import Hetionet
G = Hetionet()
G.dump_nodes('/tmp/nodes.tsv')

you actually get this:

> head /tmp/nodes.tsv
node_name
Anatomy::UBERON:0000002
Anatomy::UBERON:0000004
...
...

whereas I'd expect (from the defaults):

id    category
Anatomy::UBERON:0000002   Anatomy
...
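
For anyone who wants to confirm the mismatch on their own dump, here is a minimal sketch that compares a file's header row against the column names listed in the docstring. It uses only the standard library. The helper names `read_header` and `header_matches_docs` are hypothetical, and the sample file is a stand-in that copies the observed output rather than calling `dump_nodes` itself.

```python
import csv
import os
import tempfile

# Column names the docstring lists as defaults (taken from the help text quoted above).
DOCUMENTED_HEADER = ["id", "category"]


def read_header(path, separator="\t"):
    """Return the first row of a delimited file as a list of column names."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle, delimiter=separator), [])


def header_matches_docs(path, expected=DOCUMENTED_HEADER, separator="\t"):
    """Return True if the file's first row equals the documented column names."""
    return read_header(path, separator) == list(expected)


# Stand-in for a file written by dump_nodes; the contents mirror the observed output.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "nodes.tsv")
    with open(sample, "w") as handle:
        handle.write("node_name\nAnatomy::UBERON:0000002\n")
    observed = read_header(sample)
    matches = header_matches_docs(sample)

print(observed, matches)
```

To check a real dump, pass the path of the written file to `header_matches_docs` instead of the temporary sample.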

Not a big issue, but confusing behavior at first! Thanks for this great software package!!