Open Sozialarchiv opened 4 years ago
This issue may be related to #36.
The user-friendly comment
In a default conversion one gets something like this:
<https://iisg.amsterdam/mybase/405cbee5590602b3d786d315219350543d25148f> <https://iisg.amsterdam/mybase/vocab/path> "/home/path/to/at-list.csv"^^<http://www.w3.org/2001/XMLSchema#string> <urn:uuid:7ece5d83-d53a-49bc-ae54-bfef5ed0b09a> .
But the question is: what should it become instead? For example, this would look weird:
<https://iisg.amsterdam/mybase/405cbee5590602b3d786d315219350543d25148f> <https://iisg.amsterdam/mybase/vocab/path> "at-list.csv"^^<http://www.w3.org/2001/XMLSchema#string> <urn:uuid:7ece5d83-d53a-49bc-ae54-bfef5ed0b09a> .
That's simply the name of the file, not a path. Relative paths are also not really an option as they have the same dangers as absolute paths.
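To illustrate why a relative path is not safer, here is a minimal sketch using only the Python standard library. The directory names and the username `alice` are hypothetical and chosen purely for illustration; they do not come from the tool or from this issue.

```python
import posixpath

# Hypothetical locations, for illustration only.
source = "/home/alice/data/at-list.csv"   # file being converted
working_dir = "/srv/project"              # directory the tool is run from

# A relative path is computed against the working directory, so it can
# still walk through the user's home directory.
relative = posixpath.relpath(source, start=working_dir)

# The basename drops every directory component.
basename = posixpath.basename(source)

print(relative)
print(basename)
```

In this sketch the relative path still contains the username, while the basename does not.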
The dev comment
I took a dive into the code; there needs to be some discussion on the provenance graph specifically.
# A URI that represents the version of the file being converted
self.dataset_version_uri = SDR[source_hash]
self.add((self.dataset_version_uri, SDV['path'], Literal(file_name, datatype=XSD.string)))
self.add((self.dataset_version_uri, SDV['sha1_hash'], Literal(source_hash, datatype=XSD.string)))
# ----
# The nanopublication graph
# ----
name = (os.path.basename(file_name)).split('.')[0]
self.uri = SDR[f"{name}/nanopublication/{hash_part}"]
Note: file_name is the full path (for legacy reasons).
A possible change would be to move name above the line with SDV['path']. The issue, though, is that privacy-sensitive information may still be disclosed if a relative path is shown instead of an absolute one. The only safe thing to do here is to include only the name of the file, without any path information. But that does change the semantics of SDV['path']. Therefore, a discussion within Clariah is needed.
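As a starting point for that discussion, the basename-only option could be sketched roughly as follows. This is not the project's implementation: the helper names `provenance_path` and `nanopublication_name` and the `include_full_path` flag are hypothetical, and only the standard library is used so the sketch runs on its own.

```python
import os


def provenance_path(file_name: str, include_full_path: bool = False) -> str:
    """Return the value that would be stored under SDV['path'].

    Hypothetical helper: by default only the file name is kept, so no
    directory (and therefore no username) ends up in the output.
    """
    if include_full_path:
        # Legacy behaviour: keep whatever path the caller passed in.
        return file_name
    return os.path.basename(file_name)


def nanopublication_name(file_name: str) -> str:
    """Mirror the existing derivation of `name` from the excerpt above."""
    return os.path.basename(file_name).split('.')[0]


full = "/home/path/to/at-list.csv"  # path taken from the example triple
print(provenance_path(full))
print(provenance_path(full, include_full_path=True))
print(nanopublication_name(full))
```

Under this assumption, the literal written for SDV['path'] would be built from `provenance_path(file_name)` rather than from `file_name` directly. Whether the property should then still be called path is exactly the semantic question raised above.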
Is there an option to prevent the full file path from being added automatically to the output (e.g. by including only the basename)?
In some cases this can be a privacy issue (the path can contain a username, for example).