Open BrianJKoopman opened 4 years ago
I can't think of a way that the detailed structure of a file's basename matters except for determining the file type and a minor bit of HERA-specific MJD sniffing functionality. I think it can be useful in a system like the Librarian to use extensions even for directory names, but we definitely don't want to make that a requirement and the current behavior is clearly not good.
I'm not sure if I'd recommend defaulting to a filetype of "directory" if there's no extension — it would be probably be a rarer situation but people might want to add single files whose basenames don't include extensions either. In these cases I'd lean towards just setting the filetype to the empty string.
I'm not sure if I'd recommend defaulting to a filetype of "directory" if there's no extension — it would be probably be a rarer situation but people might want to add single files whose basenames don't include extensions either. In these cases I'd lean towards just setting the filetype to the empty string.
Yeah, agreed that "directory" as the default for extensionless files doesn't sound good. The solution I came up with in the moment (https://github.com/BrianJKoopman/librarian/commit/d9ca347075e9b409c915ef85519b67efb071233a) would check extensionless files for whether they are directory before marking it as such. Another check would be needed to properly mark extensionless files (that aren't directories) with an "unknown" or empty type. I'm happy to add that with another test and PR if you'd like.
I've been exploring using the librarian for the Simons Obsevatory and have come across a couple of things in testing that I thought I'd bring up.
If I upload a directory to the librarian, and name it such that it doesn't contain a '.', then the reported type will be a string that contains the staging directory that the librarian made and the full name of my directory. For instance:
This successfully uploads, with the resulting entry:
I would have expected the type to be reported as a "directory", though see the type check is entirely dependent on the extension. Based on the test cases it seems directories uploaded by HERA might always contain some sort of extension. Is this naming a requirement for other expected behavior?