ModellingWebLab / WebLab

Django-based front-end for the modelling Web Lab v2
Other
3 stars 2 forks source link

Allow a file to be marked as "main" #23

Closed helenst closed 6 years ago

helenst commented 6 years ago

One file in a model or protocol should be considered the "main" file and should be marked as such. This might change from one version to the next.

jonc125 commented 6 years ago

I'm guessing this will be marked just in the DB, not in the git repo.

MichaelClerx commented 6 years ago

Is the repo not a self-contained thing then? Sounds dangerous to have a DB keeping tabs on what's inside version control systems. Maybe something simpler like a hidden settings file in each repo? (Would allow users to fix bugs too, a bit like gits config files)

jonc125 commented 6 years ago

The DB field is the easiest approach. Probably the 'right' way to do it longer term is to include a COMBINE archive manifest file within each repo, which lists the file and indicates which is the main (master) one. But you do then have to check it's kept up-to-date if pulling commits from some other system.

MichaelClerx commented 6 years ago

So if I clone the repo, change the main file and push, how do I tell the DB about the new main file?

jonc125 commented 6 years ago

That's a more general question about how the DB tracks updates to the git repo made by someone else. @helenst, how is this managed at present? It may well be that we want only to allow pulls to the Web Lab from an external repo, rather than pushes to the Web Lab from a clone; certainly doing the auth for push access would require more work.

MichaelClerx commented 6 years ago

I'm also a bit sceptical about the idea of having bits of the web lab's functionality depend critically on COMBINE archives!

helenst commented 6 years ago

At the moment, the web interface gets that information directly from the repo, so externally pushed changes would be picked up on, and the database isn't involved in any internal details of the repo. I'd like to keep things that way if possible, even if we don't do push access now it sets things up nicely for that to happen in future. So I'd favour some kind of manifest file to keep track of which file is main.

jonc125 commented 6 years ago

OK, manifest file it is, and we may as well go with the COMBINE manifest since that's an accepted standard within the community for the information we want to record.

Regarding Michael's concerns, I can't think of anything we're likely to need to include that wouldn't fit within the standard. It specifies:

MichaelClerx commented 6 years ago

OK, I'm reading the spec and beginning to really like this idea!

Would this mean our repos are actually "gitted" rather than "zipped" versions of a combine archive?

And should we create some (super lightweight) manifest file reader/writer? I could have a go at that if needed

jonc125 commented 6 years ago

From https://combinearchive.org/software/ there's already a Python library - https://github.com/FreakyBytes/pyCombineArchive - so we should probably start by evaluating that :) - see #30.

jonc125 commented 6 years ago

Related to this, I've just realised that currently the visibility is per-entity, rather than per-version like the old Web Lab. I'm guessing the best way to fix this would be to include that info within a metadata file in the repo, since there isn't an equivalent git commit property?

jonc125 commented 6 years ago

Ah, but then we can't change the visibility of an existing version. Hum. So it's going to have to be stored somewhere else.

jonc125 commented 6 years ago

I'll create a separate issue for visibility - it ties into larger user permissions & sharing concepts.

helenst commented 6 years ago

I notice in the COMBINE spec that multiple (or no) files can be set as master. I think old weblab only makes you choose exactly one. Do we want to stick with that for the new version, or go down the "any number of master files" route?

jonc125 commented 6 years ago

Huh, I thought it had to be at most 1 in the spec! Let me check...

jonc125 commented 6 years ago

The descriptive text says "When set to "true", it indicates that the file declared by the content element should be used first when processing the content of the archive" and "In most cases, one content element per archive will have its master attribute set to true." So I think the implication is that there can be at most one - it doesn't make sense semantically otherwise.

Ah, but later it says "However, in some cases, the creator of the archive might want to set several master attribute to true, for instance to indicate alternative renditions of a model in different formats, or to bundle together several models described in the same publication. In that case the reading software are free to open a specific file, all the files tagged as master, or display a dialog box to the user offering a choice. This might create some confusion as opening the same archive several times in the same software might result in a different model loaded. One option would be to not set any master attribute to true, being aware that proper interpretation of the archive might be difficult, in particular for modular models."

I agree that this would be confusing, so think we should stick to only allowing one master file when creating manifests. If there's already a manifest with multiple masters, let's arbitrarily pick the first in cases where we need to (e.g. running an experiment, generating a readme.md for a protocol) and display on the UI that multiple files are tagged as master (I think master files are highlighted in the old weblab).

helenst commented 6 years ago

I managed to get python-libcombine to write a manifest file relatively easily as follows:

from libcombine import CombineArchive, CaWriter

archive = CombineArchive()
archive.addFile('basic_ep_model.cellml', 'basic_ep_model.cellml', 'cellml', True)
writer = CaWriter()
writer.writeOMEXToFile(archive.getManifest(), 'manifest.xml')

output:

<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="basic_ep_model.cellml" format="cellml" master="true"/>
</omexManifest>
helenst commented 6 years ago

@jonc125 I agree, that sounds a sensible approach - And if there's no master file marked, again I suppose an arbitrary choice would have to be made.

MichaelClerx commented 6 years ago

I agree that this would be confusing, so think we should stick to only allowing one master file when creating manifests. If there's already a manifest with multiple masters, let's arbitrarily pick the first in cases where we need to (e.g. running an experiment, generating a readme.md for a protocol) and display on the UI that multiple files are tagged as master (I think master files are highlighted in the old weblab).

Yes! We could go even further and just show this as an error, since there's no way we can promise the software will do what the user wants in this case?

jonc125 commented 6 years ago

That's a good point actually. The readme extraction could create an error.txt instead (which the old Web Lab then renders by default), and similarly when we run experiments.

jonc125 commented 6 years ago

This was fixed by #47, apart from the error reporting point above which I've put in #48.