Closed devineyfajr closed 1 year ago
Hi @devineyfajr, thanks for reporting this. Do you have an example graph that reproduces the issue?
Unfortunately I don't.
Hi @devineyfajr, was there more to the error in the console or in debug.log? Also, can you please provide the pipeline configuration, including the configuration of the MultilayerPerceptron, if any?
no more error info than that above.
CALL gds.beta.pipeline.nodeClassification.create("vt-MLP");
CALL gds.beta.pipeline.nodeClassification.selectFeatures("vt-MLP", ["degreeCentrality","embedding"]);
CALL gds.beta.pipeline.nodeClassification.configureSplit("vt-MLP", {validationFolds: 10, testFraction: 0.25});
CALL gds.alpha.pipeline.nodeClassification.addMLP("vt-MLP", {
  hiddenLayerSizes: [256, 256],
  focusWeight: {range: [0.0, 1.0]},
  batchSize: {range: [50, 200]},
  minEpochs: {range: [1, 5]}
}) YIELD parameterSpace;
CALL gds.alpha.pipeline.nodeClassification.configureAutoTuning('vt-MLP', {maxTrials: 8});
// create graph here
CALL gds.beta.pipeline.nodeClassification.train('myGraph', {
  pipeline: 'vt-MLP',
  targetNodeLabels: ["__ALL__"],
  modelName: 'nc-MLP-model-frp',
  targetProperty: 'classId',
  metrics: ['F1_MACRO']
}) YIELD modelInfo, modelSelectionStats
RETURN modelInfo, modelSelectionStats;
Hi @devineyfajr ,
Thanks for the above. It's hard to pin down the exact problem without an example to reproduce it. I do have one suggestion to try out.
You've specified targetNodeLabels: ["__ALL__"] in train. I think this means that your graph projection used "*" for nodeLabels, so all nodes have the same label (__ALL__), with some properties on them.
For node classification, it is common for the graph to contain nodes with at least two different labels. For example, in https://neo4j.com/docs/graph-data-science/current/machine-learning/node-property-prediction/nodeclassification-pipelines/training/#nodeclassification-pipelines-examples-train-filtering, there are House nodes, which have a few nodeProperties plus a class property, and UnknownHouse nodes, which have the same nodeProperties but no class property.
I think you might want to try:
Project your graph with different nodeLabels.
In training, set targetNodeLabels to the labels whose nodes have classId as a property (e.g. House).
In predict, set targetNodeLabels to the labels whose nodes don't have classId (e.g. UnknownHouse).
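The suggestion above could look roughly like the following. This is only a sketch under assumptions: the labels Labeled and Unlabeled are hypothetical placeholders for whatever labels separate nodes with a classId from nodes without one, and degreeCentrality and embedding are assumed to already be stored as node properties in the database. The pipeline and model names are taken from the configuration posted earlier in the thread.

```cypher
// Hypothetical labels: Labeled nodes carry classId, Unlabeled nodes do not.
// Assumes degreeCentrality and embedding already exist as stored node properties.
CALL gds.graph.project(
  'myGraph',
  {
    Labeled:   {properties: ['degreeCentrality', 'embedding', 'classId']},
    Unlabeled: {properties: ['degreeCentrality', 'embedding']}
  },
  '*'
);

// Train only on the nodes that have a classId.
CALL gds.beta.pipeline.nodeClassification.train('myGraph', {
  pipeline: 'vt-MLP',
  targetNodeLabels: ['Labeled'],
  modelName: 'nc-MLP-model-frp',
  targetProperty: 'classId',
  metrics: ['F1_MACRO']
}) YIELD modelInfo, modelSelectionStats
RETURN modelInfo, modelSelectionStats;

// Predict only for the nodes that do not have a classId.
CALL gds.beta.pipeline.nodeClassification.predict.stream('myGraph', {
  modelName: 'nc-MLP-model-frp',
  targetNodeLabels: ['Unlabeled']
}) YIELD nodeId, predictedClass
RETURN nodeId, predictedClass;
```

Whether this avoids the ArrayIndexOutOfBoundsException is untested, since the problem could not be reproduced; it only separates the training and prediction node sets as described.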
Hi @devineyfajr ,
We'll close the issue for now as it could not be reproduced. Do let us know if the suggestion above fixes your problem. If not, feel free to reopen or raise a new issue.
Thanks!
gds.beta.pipeline.nodeClassification.predict.stream yields results for RandomForest and LogisticRegression models applied to same graph:
Console message: Failed to invoke procedure gds.beta.pipeline.nodeClassification.predict.stream: Caused by: java.lang.ArrayIndexOutOfBoundsException
Log message:
2023-04-03 12:38:30.398+0000 INFO [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt-95309 - /127.0.0.1:51554]] Node Classification Predict Pipeline :: Start
2023-04-03 12:38:30.398+0000 INFO [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt-95309 - /127.0.0.1:51554]] Node Classification Predict Pipeline :: Execute node property steps :: Start
2023-04-03 12:38:30.398+0000 INFO [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt-95309 - /127.0.0.1:51554]] Node Classification Predict Pipeline :: Execute node property steps :: Finished
2023-04-03 12:38:30.398+0000 INFO [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt-95309 - /127.0.0.1:51554]] Node Classification Predict Pipeline :: Node classification predict :: Start
2023-04-03 12:38:32.116+0000 INFO [o.n.k.a.p.GlobalProcedures] [gds-4] Node Classification Predict Pipeline :: Node classification predict 25%
2023-04-03 12:38:32.126+0000 INFO [o.n.k.a.p.GlobalProcedures] [gds-2] Node Classification Predict Pipeline :: Node classification predict 49%
2023-04-03 12:38:32.127+0000 INFO [o.n.k.a.p.GlobalProcedures] [gds-1] Node Classification Predict Pipeline :: Node classification predict 74%
2023-04-03 12:38:32.130+0000 INFO [o.n.k.a.p.GlobalProcedures] [gds-3] Node Classification Predict Pipeline :: Node classification predict 100%
2023-04-03 12:38:32.130+0000 INFO [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt-95309 - /127.0.0.1:51554]] Node Classification Predict Pipeline :: Node classification predict :: Finished
2023-04-03 12:38:32.130+0000 INFO [o.n.k.a.p.GlobalProcedures] [neo4j.BoltWorker-2 [bolt-95309 - /127.0.0.1:51554]] Node Classification Predict Pipeline :: Failed
2023-04-03 12:38:32.130+0000 WARN [o.n.k.a.p.GlobalProcedures] Computation failed java.lang.ArrayIndexOutOfBoundsException: null