biosimulations / biosimulations-physiome

Command-line application for publishing the Physiome model repository of physiological models to the BioSimulations repository for simulation projects
https://docs.biosimulations.org/repositories/physiome
0 stars 0 forks source link

Adding CombineArchiveConent with path to existing metadata file creates new file in archive #20

Closed bilalshaikh42 closed 2 years ago

bilalshaikh42 commented 2 years ago

https://github.com/biosimulations/biosimulations-physiome/blob/dev/biosimulations_physiome/process_projects.py#L261 When adding a CombineArchiveContent and providing a path to an existing metadata file (relative to source directory), writing the combine archive produces a new file called metadta_1 instead of including the existing file.

@jonrkarr Am I using the utils library incorrectly here? The metadata.rdf file is already present in the directory that is then passed to the CombineArchiveWriter, but it seems to create its own file

jonrkarr commented 2 years ago

Libcombine will create an additional metadata file if you set the metadata (author, created, updated) attributes of the combinearchive and combinearchivecontent. Basically, libcombine maps some attributes of its classes to OMEX manifest files, and some to metadata RDF files. This follows the earlier OMEX Metadata 1.0 guidelines. ​Is this what you're referring to?

I've been following the more recent 1.1 guidelines. I separately create a metadata file, and don't use any of the metadata attributes of the combinearchive and combinearchivecontent classes. I only use the attributes that map to manifest files (location, format, master).

I realize multiple aspects of this aren't 100% clear. I've brought this up multiple times with John and David. Matthias has also pushed to try to clarify the conventions for metadata.

bilalshaikh42 commented 2 years ago

That seems to be the cause, yup. I didn't realize that the underlying library was handling metadata on its own as well in addition to the methods provided by Biosimulators_Utils. I figured the metadata attributes were going to be interpreted by biosimulators_utils to add to the 1.1 spec file.

Simply removing the attributes added to the class should fix this then

bilalshaikh42 commented 2 years ago

Would it make sense then to remove these attributes from the combine archive data model in biosimulators_utils? In the class definition below for example, a user of the BioSimulators_Utils library might add the metadata as class attributes rather than through the separate data model for metadata that we have

class CombineArchiveContent(CombineArchiveBase):
    """ A content item (e.g., file) in a COMBINE/OMEX archive

    Attributes:
        location (:obj:`str`): path to the content
        format (:obj:`str`): URL for the specification of the format of the content
        master (:obj:`bool`): :obj:`True`, if the content is the "primary" content of the parent archive
        description (:obj:`str`): description
        authors (:obj:`list` of :obj:`Person`): authors
        created (:obj:`datetime.datetime`): created date
        updated (:obj:`datetime.datetime`): updated date
    """

    def __init__(self, location=None, format=None, master=False, description=None, authors=None, created=None, updated=None):
        """
        Args:
            location (:obj:`str`, optional): path to the content
            format (:obj:`str`, optional): URL for the specification of the format of the content
            master (:obj:`bool`, optional): :obj:`True`, if the content is the "primary" content of the parent archive
            description (:obj:`str`, optional): description
            authors (:obj:`list` of :obj:`Person`, optional): authors
            created (:obj:`datetime.datetime`, optional): created date
            updated (:obj:`datetime.datetime`, optional): updated date
        """
        self.location = location
        self.format = format
        self.master = master
        self.description = description
        self.authors = authors or []
        self.created = created
        self.updated = updated
jonrkarr commented 2 years ago

Will do. I agree this is an unnecessary potential source of confusion. I had implemented this before we decided how to deal with metadata. I had included this for symmetry with libCombine.