AuvaLab / itext2kg

Incremental Knowledge Graphs Constructor Using Large Language Models
GNU Lesser General Public License v2.1

"AttributeError: 'NoneType' object has no attribute 'keys' " when resolving entities #17

Open GiancarloAllasia opened 3 hours ago

GiancarloAllasia commented 3 hours ago

Far too often, when I run the build_graph() method, I hit this error:

{
    "name": "AttributeError",
    "message": "'NoneType' object has no attribute 'keys'",
    "stack": "---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[30], line 1
----> 1 global_ent, global_rel = itext2kg.build_graph(
      2     sections=semantic_blocks_cv, 
      3     ent_threshold=0.2, 
      4     rel_threshold=0.2
      5 )

File ~/Documents/Code/KnowledgeGraph/itext2kg/.venv/lib/python3.10/site-packages/itext2kg/graph_integration/itext2kg.py:137, in iText2KG.build_graph(self, sections, existing_global_entities, existing_global_relationships, ent_threshold, rel_threshold)
    135 for i in range(1, len(sections)):
    136     print(\"[INFO] Extracting Entities from the Document\", i+1)
--> 137     entities = self.ientities_extractor.extract_entities(context= sections[i])
    139     processed_entities, global_entities = self.matcher.process_lists(list1 = entities, list2=global_entities, for_entity_or_relation=\"entity\", threshold=ent_threshold)
    141     #relationships = relationship_extraction(context= sections[i], entities=list(map(lambda w:w[\"name\"], processed_entities)))

File ~/Documents/Code/KnowledgeGraph/itext2kg/.venv/lib/python3.10/site-packages/itext2kg/ientities_extraction/ientities_extractor.py:40, in iEntitiesExtractor.extract_entities(self, context, embeddings, property_name, entity_name_key)
     36 entities = self.langchain_output_parser.extract_information_as_json_for_context(context=context, output_data_structure=EntitiesExtractor)
     37 print(entities)
---> 40 if \"entities\" not in entities.keys() or entities == None:
     41     print(\"Not formatted in the desired format, we are retrying ....\")
     42     self.extract_entities(context=context, entities=entities, embeddings=embeddings, property_name=property_name, entity_name_key=entity_name_key)

AttributeError: 'NoneType' object has no attribute 'keys'"
}

I think the case where the LangChain JSON parser produces a null output is not handled properly, but I'm not sure. It is very difficult to make it converge, even with the tutorial CV examples. I'm using Ollama with Llama 3.1.
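
For illustration, here is a minimal sketch of the kind of guard that would avoid this error. It is not the library's actual code or fix. In the traceback, line 40 evaluates `entities.keys()` before `entities == None`, so the null check never gets a chance to run. The sketch checks for a usable dict first and bounds the number of retries. The names `extract_with_retry`, `extract`, `required_key`, and `max_retries` are hypothetical, and `extract` is a stand-in for the parser call.

```python
from typing import Callable, Optional


def extract_with_retry(
    extract: Callable[[], Optional[dict]],
    required_key: str = "entities",
    max_retries: int = 3,
) -> dict:
    """Call `extract` until it returns a dict containing `required_key`.

    `extract` is a hypothetical stand-in for the LLM/parser call; it may
    return None when the model output cannot be parsed as JSON.
    """
    for attempt in range(1, max_retries + 1):
        result = extract()
        # Test for a real dict first, so the key lookup is never
        # attempted on a None (or otherwise malformed) parser output.
        if isinstance(result, dict) and required_key in result:
            return result
        print(f"Attempt {attempt}: output not in the desired format, retrying ...")
    raise ValueError(
        f"No output containing {required_key!r} after {max_retries} attempts"
    )


# Simulated parser: returns None, then a malformed dict, then a valid result.
_outputs = iter([None, {"unexpected": []}, {"entities": [{"name": "Python"}]}])
entities = extract_with_retry(lambda: next(_outputs))
print(entities)
```

Using a loop with a retry limit instead of recursion means a model that never produces valid JSON ends in a clear error rather than an endless retry.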

lairgiyassir commented 3 hours ago

Hello,

I have already refactored all of the iText2KG code in version 0.0.7.

Please try installing the latest version:

pip install --upgrade itext2kg

lairgiyassir commented 3 hours ago

Are you using Llama 3.1 8B? It is much better to upgrade to 70B; otherwise it will be hard for the model to structure the output.

We had the same problem in #6.