Hello @meg261995 and thanks for reaching out to us!
Are you attempting to use the gds.alpha.ml.oneHotEncoding() function?
If so, you must provide two inputs: the dictionary of tokens/words, and the list of items from that dictionary that you want encoded.
For example:
WITH ['eu', 'en', 'bg', 'ca'] AS dictionary
WITH
gds.alpha.ml.oneHotEncoding(dictionary, ['en', 'eu']) AS eneu,
gds.alpha.ml.oneHotEncoding(dictionary, ['bg', 'eu']) AS bgeu
RETURN eneu, bgeu
which returns
╒═════════╤═════════╕
│"eneu" │"bgeu" │
╞═════════╪═════════╡
│[1,1,0,0]│[1,0,1,0]│
└─────────┴─────────┘
See also https://neo4j.com/docs/graph-data-science/current/alpha-algorithms/one-hot-encoding/
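If your values are stored as comma-separated strings such as 'en,eu', one possible approach is to split each string into a list first and pass that list as the second argument. The following is only a minimal sketch: the inline list of values is an assumed stand-in for your own data, and split() is Cypher's built-in string function.

```cypher
// Dictionary of all individual classes; its order fixes the vector positions.
WITH ['eu', 'en', 'bg', 'ca'] AS dictionary
// Hypothetical sample values; replace with your own property values.
UNWIND ['eu', 'en', 'en,eu', 'bg', 'ca', 'bg,eu'] AS value
// split() turns 'en,eu' into ['en', 'eu'], so combined values reuse the existing classes.
RETURN value, gds.alpha.ml.oneHotEncoding(dictionary, split(value, ',')) AS encoding
```

With this dictionary, 'en,eu' should produce [1,1,0,0] and 'bg,eu' should produce [1,0,1,0], the same vectors as in the example above, so combined values are not treated as separate classes.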
All the best, Mats
If you are in need of more general Cypher help, I recommend making use of
@meg261995 Did Mats's response solve your issue?
I am closing the issue due to inactivity. Feel free to reopen it :)
Suppose I have values something like this:
1) eu 2) en 3) en,eu 4) bg 5) ca 6) bg,eu
I want the one hot encoding to look like this:
1) 100000 2) 010000 3) 110000 4) 000100 5) 000010 6) 100100
How do I achieve this in Cypher? I do not want 3 and 6 to be considered separate classes. It's possible in Python, but I'm a beginner in Cypher.