The "value" attribute of a tag is now taken into account and processed as "text" in the ontology.
Tables are now parsed without ids and classes. There are several reasons for this: for example, embeddings computed over markup that includes ids and classes can lose some semantic value, and more tokens make each LLM call more expensive.
Cleaned up to_html and added to_text for OntologyElement.
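
To illustrate the first two changes, here is a minimal sketch using only Python's standard html.parser. It is not the code from this PR: the names AttributeStripper, ValueTextExtractor, strip_ids_and_classes and value_as_text are hypothetical, and the assumption that only "id" and "class" are dropped comes from the description above.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these are the only attributes removed from table markup.
DROPPED_ATTRS = {"id", "class"}


class AttributeStripper(HTMLParser):
    """Re-emit HTML while dropping id and class attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _render_attrs(self, attrs):
        kept = []
        for name, value in attrs:
            if name in DROPPED_ATTRS:
                continue
            # Boolean attributes arrive with value None and are kept bare.
            kept.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
        return (" " + " ".join(kept)) if kept else ""

    def handle_starttag(self, tag, attrs):
        self.parts.append(f"<{tag}{self._render_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.parts.append(f"<{tag}{self._render_attrs(attrs)}/>")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded on input, so re-escape on output.
        self.parts.append(escape(data, quote=False))


class ValueTextExtractor(HTMLParser):
    """Collect visible text plus any "value" attribute, in document order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.texts = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("value")
        if value:
            self.texts.append(value)

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


def strip_ids_and_classes(html):
    parser = AttributeStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def value_as_text(html):
    parser = ValueTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.texts)
```

For a table such as `<table id="t1" class="grid">...`, strip_ids_and_classes keeps structural attributes like colspan and drops only ids and classes, which shortens the markup passed to an embedding model or LLM. value_as_text treats the value of an input as ordinary text alongside the surrounding content.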