Closed Prikshit7766 closed 6 months ago
Describe the bug
I trained two models using Spark NLP and get an error when comparing them.
The prediction pipelines are set up as follows:
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import (
    SentenceDetector, Tokenizer, WordEmbeddingsModel, NerDLModel, NerConverter
)

# `spark` is an existing Spark NLP session (its creation is not shown in this issue).

document = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

sentence = SentenceDetector() \
    .setInputCols(['document']) \
    .setOutputCol('sentence')

token = Tokenizer() \
    .setInputCols(['sentence']) \
    .setOutputCol('token')

glove_embeddings = WordEmbeddingsModel.pretrained('glove_100d') \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

# load trained model
loaded_ner_model = NerDLModel.load("models/trained_ner_model") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

# load augmented trained model
loaded_augmented_ner_model = NerDLModel.load("models/augmented_trained_ner_model") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

converter = NerConverter() \
    .setInputCols(["sentence", "token", "ner"]) \
    .setOutputCol("ner_chunk")

ner_prediction_pipeline = Pipeline(stages=[
    document, sentence, token, glove_embeddings, loaded_ner_model, converter
])

aug_ner_prediction_pipeline = Pipeline(stages=[
    document, sentence, token, glove_embeddings, loaded_augmented_ner_model, converter
])

# Fit on an empty DataFrame to obtain PipelineModel objects.
ner_model = ner_prediction_pipeline.fit(spark.createDataFrame([[""]]).toDF("text"))
augmented_ner_model = aug_ner_prediction_pipeline.fit(spark.createDataFrame([[""]]).toDF("text"))
Model Comparison
from langtest import Harness

models = [
    {"model": ner_model, "hub": "johnsnowlabs"},
    {"model": augmented_ner_model, "hub": "johnsnowlabs"},
]

harness = Harness(task="ner", model=models, data={"data_source": '/content/sample.conll'})
I get the error below in the report section:

![image](https://github.com/JohnSnowLabs/langtest/assets/101416953/02b94f24-64c1-4309-aac6-a6dc9ed597c4)
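
For anyone trying to reproduce this, here is a minimal sketch of how the report step is usually reached, assuming langtest's `generate()`, `run()`, and `report()` methods on `Harness`. The helper name `run_comparison` is hypothetical (it is not part of langtest), and the exact calls made in the original session are not shown in this issue. The import is deferred so the function can be defined without langtest installed.

```python
def run_comparison(models, data_path):
    """Run a langtest NER comparison and return the report.

    `models` is a list of {"model": ..., "hub": ...} dicts, as in the issue.
    `run_comparison` is a hypothetical helper, not part of langtest.
    """
    # Deferred import: langtest is only needed when the function is called.
    from langtest import Harness

    harness = Harness(task="ner", model=models, data={"data_source": data_path})
    harness.generate()  # build test cases from the data
    harness.run()       # run the test cases against every model in the list
    # Assumption: this is the "report section" where the issue sees the error.
    return harness.report()
```

With the objects defined above, the call would be `run_comparison(models, "/content/sample.conll")`.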