JohnSnowLabs / langtest

Deliver safe & effective language models
http://langtest.org/
Apache License 2.0

TypeError in Model Comparison Report Generation - '<' not supported between instances of 'PipelineModel' and 'PipelineModel' #922

Closed. Prikshit7766 closed this issue 6 months ago

Prikshit7766 commented 9 months ago

Describe the bug

I trained two models using Spark NLP and built a prediction pipeline for each, to use in model comparison:

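document assembler (assumed; `document` is used in the pipelines below but is not defined in the report)

document = DocumentAssembler()\
    .setInputCol('text')\
    .setOutputCol('document')

sentence = SentenceDetector()\
    .setInputCols(['document'])\
    .setOutputCol('sentence')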

token = Tokenizer()\
    .setInputCols(['sentence'])\
    .setOutputCol('token')

glove_embeddings = WordEmbeddingsModel.pretrained('glove_100d')\
    .setInputCols(["document", "token"])\
    .setOutputCol("embeddings")

load trained model

loaded_ner_model = NerDLModel.load("models/trained_ner_model")\
    .setInputCols(["sentence", "token", "embeddings"])\
    .setOutputCol("ner")

load augmented trained model

loaded_augmented_ner_model = NerDLModel.load("models/augmented_trained_ner_model")\
    .setInputCols(["sentence", "token", "embeddings"])\
    .setOutputCol("ner")

converter = NerConverter()\
    .setInputCols(["sentence", "token", "ner"])\
    .setOutputCol("ner_chunk")

ner_prediction_pipeline = Pipeline(
    stages=[
        document,
        sentence,
        token,
        glove_embeddings,
        loaded_ner_model,
        converter
    ])

aug_ner_prediction_pipeline = Pipeline(
    stages=[
        document,
        sentence,
        token,
        glove_embeddings,
        loaded_augmented_ner_model,
        converter
    ])

ner_model = ner_prediction_pipeline.fit(spark.createDataFrame([[""]]).toDF("text"))

augmented_ner_model = aug_ner_prediction_pipeline.fit(spark.createDataFrame([[""]]).toDF("text"))


Model Comparison

models = [
    {"model": ner_model, "hub": "johnsnowlabs"},
    {"model": augmented_ner_model, "hub": "johnsnowlabs"}
]

harness = Harness(task="ner", model=models, data={"data_source":'/content/sample.conll'})


The report step fails with the error below (`TypeError: '<' not supported between instances of 'PipelineModel' and 'PipelineModel'`):

![image](https://github.com/JohnSnowLabs/langtest/assets/101416953/02b94f24-64c1-4309-aac6-a6dc9ed597c4)
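For readers unfamiliar with this message: Python raises it whenever objects that define no ordering are sorted or compared with `<`. The report does not show where inside langtest the comparison happens, so the following is only a minimal sketch of the general behavior, not a reproduction of the langtest code path. `FakePipelineModel` and `try_sort` are hypothetical names standing in for a fitted `PipelineModel` and whatever sorting step the report generation performs.

```python
class FakePipelineModel:
    """Hypothetical stand-in for a fitted PipelineModel: it defines no ordering methods."""

    def __init__(self, name):
        self.name = name


def try_sort(items):
    """Return the sorted list, or the TypeError message if the items cannot be ordered."""
    try:
        return sorted(items)
    except TypeError as exc:
        return str(exc)


models = [FakePipelineModel("ner"), FakePipelineModel("augmented_ner")]

# Sorting the model objects themselves fails: sorted() compares items with '<',
# and the class provides no __lt__.
error_message = try_sort(models)
print(error_message)

# Sorting by a string label instead of the object works.
labels = try_sort([m.name for m in models])
print(labels)
```

If the report code uses the model objects themselves as sortable keys (an assumption, not something the screenshot confirms), identifying each model by a string name would avoid the comparison.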