thortiede / sbml4j

Load SBML Files and persist them in Neo4j
MIT License

FileNode gets reused #77

Closed thortiede closed 2 years ago

thortiede commented 3 years ago

If a filename is reused in a subsequent call, a new pathway node is created for it, but the existing FileNode is reused. This leads to undesired effects when searching the database for a pathway, since a pathway is connected through its FileNode.

The FileNode needs to be unique within the database. Maybe timestamp the name, or add a database constraint (enterprise feature only).

thortiede commented 2 years ago

I have now added the MD5 checksum of the file to the database node key. If the MD5 sum of a file matches one that is already loaded, the existing FileNode is reused and the pathway connected to it is returned. If the MD5 sum differs, we load the contained model again into a new pathway.
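For readers who want to see the idea in code, here is a minimal sketch of content-based reuse. It is not the sbml4j implementation: the class `FileNodeRegistry`, the method `loadFile`, and the in-memory map standing in for the Neo4j lookup are all hypothetical, and it assumes the MD5 sum alone decides reuse (the real node key may also include the filename).

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for the FileNode lookup; not sbml4j's actual classes. */
class FileNodeRegistry {
    // Assumption: maps the MD5 sum of a file's content to the UUID of its pathway.
    // In the real application this would be a query against Neo4j.
    private final Map<String, String> pathwayUuidByMd5 = new HashMap<>();

    /** Returns the MD5 sum of the given bytes as a lowercase hex string. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Returns the pathway UUID for this content. Identical content yields the
     * existing pathway; different content yields a new one, even if the
     * filename is the same.
     */
    String loadFile(String filename, byte[] content) {
        return pathwayUuidByMd5.computeIfAbsent(
                md5Hex(content), key -> UUID.randomUUID().toString());
    }
}
```

In this sketch, uploading `model.xml` twice with identical bytes returns the same pathway UUID, while uploading a changed `model.xml` creates a second pathway under the same name.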

thortiede commented 2 years ago

This means there can be two pathways with the same name that have different content, i.e. the pathwayIdString is not unique; only the entityUUID is unique on PathwayNodes. If this becomes an issue in the future, please open a new issue.

This issue is closed, as the FileNode is now unique for the content, as identified by its MD5 sum.