bioboxes / rfc

Request for comments on interchangeable bioinformatics containers
http://bioboxes.org
MIT License

Aid users by printing citation info #213

Open abremges opened 7 years ago

abremges commented 7 years ago

This is of rather low priority, but it would be nice if bioboxes provided citation info for the packaged tool(s) (e.g. author et al., DOI) and for bioboxes itself (Belmann et al., 10.1186/s13742-015-0087-0). This should be one of the last things printed when executing a biobox. Thoughts?

michaelbarton commented 7 years ago

I agree that citation info would be useful for attribution. I would suggest that we use the existing metadata system for this rather than printing it to stderr/stdout. This would be more semantically accessible than having to parse strings.

Perhaps, if you have some time, Andreas, you could research schemas and namespaces for container citation metadata. This could be along the lines of Label Schema:

http://label-schema.org/rc1/

This could then be generally applicable to all containerised scientific software, not just bioboxes.

abremges commented 7 years ago

I see your point in having a clean and semantically accessible solution. However, from a user's perspective, it should still be presented (=printed) on-screen, otherwise it's nice but useless. In fact, a quick and dirty solution might work for now, definitely better than postponing for months – IMHO.

fungs commented 7 years ago

Wouldn't this be best solved by logging in the bioboxes command line client? Set the log level to informative by default and print the relevant information on standard error. This requires the bioboxes command line client to know the metadata, so it needs to be stored somewhere. Otherwise, the specification does not prohibit any form of textual output, so each tool can print whatever it wants.

michaelbarton commented 7 years ago

> I see your point in having a clean and semantically accessible solution. However, from a user's perspective, it should still be presented (=printed) on-screen, otherwise it's nice but useless. In fact, a quick and dirty solution might work for now

All Docker labels are stored in the JSON that docker inspect IMAGE returns. I disagree that this information would be useless. Labels are relatively simple to add using the LABEL instruction in each Dockerfile.
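
To illustrate the point about docker inspect, here is a minimal Python sketch of reading labels back out of the JSON it prints. The label key org.bioboxes.citation.doi is a hypothetical name chosen for this example, not an agreed namespace, and the sample JSON is a hand-written stand-in for real output.

```python
import json
import subprocess


def extract_labels(inspect_output: str) -> dict:
    """Return the label mapping from the JSON text printed by `docker inspect`."""
    records = json.loads(inspect_output)  # docker inspect prints a JSON array
    if not records:
        return {}
    # Labels live under Config.Labels; the value is null when no labels are set.
    return (records[0].get("Config") or {}).get("Labels") or {}


def image_labels(image: str) -> dict:
    """Run `docker inspect IMAGE` and return its labels (needs a Docker client)."""
    result = subprocess.run(
        ["docker", "inspect", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return extract_labels(result.stdout)


# Hand-written stand-in for real output. The label key is hypothetical; in a
# Dockerfile it would be set with: LABEL org.bioboxes.citation.doi="..."
SAMPLE = (
    '[{"Config": {"Labels": '
    '{"org.bioboxes.citation.doi": "10.1186/s13742-015-0087-0"}}}]'
)
print(extract_labels(SAMPLE))
```

Keeping the parsing separate from the subprocess call means the label handling can be exercised without a Docker daemon.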

> definitely better than postponing for months – IMHO.

Speaking for myself, I try to contribute time to bioboxes when I have it available. I don't believe it is the case that we are deliberately postponing or holding back features.

michaelbarton commented 7 years ago

> Wouldn't this be best solved by logging in the bioboxes command line client? Set the log level to informative by default and print the relevant information on standard error. This requires the bioboxes command line client to know the metadata, so it needs to be stored somewhere. Otherwise, the specification does not prohibit any form of textual output, so each tool can print whatever it wants.

I would lean towards showing WARNING level and above by default; the user could then add a flag to request verbose output, which would be INFO level and above. I suggest this because it would be more in line with other Unix tools that accept a -v flag.

The question would be how to implement this, especially when running bioboxes outside the command line interface using only the docker client. We could update the bioboxes RFC so that all bioboxes support a -v/--verbose flag in addition to the task entry name.
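
The WARNING-by-default behaviour with a -v/--verbose switch could look roughly like the following Python sketch. The program and logger name biobox-sketch is made up for illustration, and this is not taken from the actual bioboxes command line client.

```python
import argparse
import logging
import sys


def build_logger(argv):
    """Return a stderr logger: WARNING and above by default, INFO with -v."""
    parser = argparse.ArgumentParser(prog="biobox-sketch")  # hypothetical name
    parser.add_argument(
        "-v", "--verbose", action="store_true",
        help="also show INFO messages",
    )
    args = parser.parse_args(argv)

    logger = logging.getLogger("biobox-sketch")
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO if args.verbose else logging.WARNING)
    return logger


log = build_logger(["-v"])
log.info("shown only because -v was passed")
log.warning("shown at every verbosity")
```

Without -v the INFO line is dropped and only the warning reaches standard error.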

abremges commented 7 years ago

> I don't believe it is the case that we are deliberately postponing or holding back features.

Please don't get me wrong, @michaelbarton; this is not what I was saying. 😉 We're all busy with other stuff that is maybe equally important but more urgent (e.g. because of externally enforced deadlines) than bioboxes.

Anyway, I briefly discussed it with @pbelmann last week, and he suggested a combination of docker inspect to extract the relevant info from the container's JSON and some logging functionality added to the CLI (if I remember and understood correctly), i.e. along the lines of what @fungs and you suggested.
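
A rough Python sketch of that combination, assuming the labels have already been read from the image metadata into a dict: the CLI formats the citations and logs them to standard error. The label keys and the logger name are hypothetical; only the bioboxes DOI comes from this thread.

```python
import logging
import sys

# Citation for bioboxes itself, as given earlier in this thread.
BIOBOXES_CITATION = "Belmann et al., doi:10.1186/s13742-015-0087-0"

# Hypothetical label keys; no namespace has been agreed on.
AUTHORS_KEY = "org.bioboxes.citation.authors"
DOI_KEY = "org.bioboxes.citation.doi"


def citation_lines(labels):
    """Build the lines to print from a dict of image labels."""
    lines = []
    doi = labels.get(DOI_KEY)
    if doi:
        authors = labels.get(AUTHORS_KEY, "unknown authors")
        lines.append(f"tool: {authors}, doi:{doi}")
    lines.append(f"bioboxes: {BIOBOXES_CITATION}")
    return lines


def log_citations(labels, logger):
    """Emit the citations at INFO level, one line each."""
    logger.info("If you use these results, please cite:")
    for line in citation_lines(labels):
        logger.info("  %s", line)


logging.basicConfig(stream=sys.stderr, level=logging.INFO, format="%(message)s")
example_labels = {AUTHORS_KEY: "Author et al.", DOI_KEY: "10.0000/example"}
log_citations(example_labels, logging.getLogger("biobox-sketch"))
```

An image without citation labels would still get the bioboxes line, so the output degrades gracefully.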

fungs commented 7 years ago

> The question would be how to implement this, especially when running bioboxes outside the command line interface using only the docker client. We could update the bioboxes RFC so that all bioboxes support a -v/--verbose flag in addition to the task entry name.

Is this a/the recommended way? If not, I wouldn't try to support it if it results in additional work.

Providing a common command line parameter for bioboxes would mean that the implementor needs to store the version somewhere in the image, which would blow up the specs further, right? Thus it means more work to build a biobox although it would be easy to pre-implement in the bbx-base image (e.g. add a file '--verbose' which will print info from /bbx/etc/version). Therefore, I think that it is the cleanest and easiest solution to let the CLI handle this.
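
For comparison, the in-image alternative might look like this Python sketch of an entry point that strips -v/--verbose from its arguments and reports the contents of /bbx/etc/version. It assumes that file holds plain text, which the thread does not specify, and the function names are invented for the example.

```python
import sys
from pathlib import Path

# Path mentioned above; treating its content as plain text is an assumption.
VERSION_FILE = Path("/bbx/etc/version")
VERBOSE_FLAGS = ("-v", "--verbose")


def split_verbose(argv):
    """Separate the verbose flag from the remaining arguments (e.g. the task)."""
    verbose = any(arg in VERBOSE_FLAGS for arg in argv)
    rest = [arg for arg in argv if arg not in VERBOSE_FLAGS]
    return verbose, rest


def version_info(path=VERSION_FILE):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def main(argv):
    verbose, rest = split_verbose(argv)
    if verbose:
        print(f"version: {version_info() or 'unknown'}", file=sys.stderr)
    return rest  # the task name would be dispatched from here


main(["--verbose", "default"])
```

Every image would need to ship something like this, which is the extra work the comment above argues against.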

abremges commented 7 years ago

I'd imagine that the target audience executes bioboxes via the CLI anyway.

pbelmann commented 7 years ago

> Providing a common command line parameter for bioboxes would mean that the implementor needs to store the version somewhere in the image, which would blow up the specs further, right? Thus it means more work to build a biobox although it would be easy to pre-implement in the bbx-base image (e.g. add a file '--verbose' which will print info from /bbx/etc/version). Therefore, I think that it is the cleanest and easiest solution to let the CLI handle this.

We could use Docker labels to store the tool version, and both the Docker client and the CLI could read out the information. The problem is that we would depend even more on Docker. Singularity-based bioboxes, as proposed by Phil Blood in his PR (https://github.com/bioboxes/bioboxes.org/pull/61), would not be able to report version numbers.