Open knguyen1 opened 6 months ago
Thanks for reporting this. If you are stuck, you can try reading the model folder at zinggDir/modelId/trainingData/marked using pyspark. This location holds your labeled data in Parquet format.
Will be handled alongside the SparkConnect change; putting on hold for now.
For anyone who just wants to get their training data:
MODEL_PATH: str = "{your model folder}/{your model ID}"
OUTPUT_PATH: str = "output.csv"
from pathlib import Path
from pyspark.sql import SparkSession
context: SparkSession = SparkSession.builder.getOrCreate()
context.sparkContext.getConf().getAll()
df = context.read.parquet(str((Path(MODEL_PATH) / "trainingData/marked").absolute()))
print(df.toPandas())
# Save to CSV
df.toPandas().to_csv(Path(OUTPUT_PATH), header=True, index=False)
Same NullPointerException on zingg:0.4.0 from the Docker image.
Thanks havardox.
I'm running Zingg from Docker and am new to Spark. How can I export the model from Docker?
Can you try running pyspark in the Docker container with the commands shared above by @havardox?
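For anyone trying this inside the container, here is a minimal sketch of the same workaround wrapped in two functions. It is not part of Zingg: the names marked_data_path and export_marked_to_csv are made up for illustration, and it assumes the config JSON has top-level "zinggDir" and "modelId" keys and that relative paths resolve from the directory you run it in.

```python
import json
from pathlib import Path


def marked_data_path(conf_path: str) -> Path:
    """Build <zinggDir>/<modelId>/trainingData/marked from a Zingg JSON config.

    Assumption: the config carries top-level "zinggDir" and "modelId" keys.
    """
    conf = json.loads(Path(conf_path).read_text())
    return Path(conf["zinggDir"]) / str(conf["modelId"]) / "trainingData" / "marked"


def export_marked_to_csv(conf_path: str, output_csv: str) -> None:
    """Read the labeled parquet data and save it as a single CSV file."""
    # Imported lazily so the path helper above works without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.parquet(str(marked_data_path(conf_path).absolute()))
    # toPandas() collects everything onto the driver; fine for labeled
    # training data, which is normally small. Requires pandas.
    df.toPandas().to_csv(output_csv, header=True, index=False)
```

Calling export_marked_to_csv with your config path and an output path that sits on a mounted volume should leave the CSV visible on the host; otherwise the file can be copied out of the container with docker cp.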
Describe the bug
Cannot generate a CSV of the model because of a NullPointerException. Phase generateDocs works just fine. From documentation: https://docs.zingg.ai/zingg/stepbystep/createtrainingdata/exportlabeleddata
To Reproduce
Steps to reproduce the behavior: Run:
(.venv) spark@496208741a60:/workspaces/foo-zingg-entity-resolution $ ~/zingg-0.4.0/scripts/zingg.sh --phase exportModel --conf /workspaces/foo-zingg-entity-resolution/datasets/trader/conf_no_bdid.json --location tmp --properties-file /workspaces/foo-zingg-entity-resolution/zingg.conf
Expected behavior
Should be able to export a CSV of the model.
Screenshots
Desktop (please complete the following information):
Smartphone (please complete the following information): N/A
Additional context