SwissDataScienceCenter / renku-python

A Python library for the Renku collaborative data science platform.
https://renku-python.readthedocs.io/
Apache License 2.0

Renku log should work when a command's input is a non-committed file #1389

Open jachro opened 4 years ago

jachro commented 4 years ago

The change that introduced storing workflows in JSON-LD makes the requirements for renku command arguments stricter. More precisely, an input can no longer be a file that is not committed to the project repository. As a result, the renku log command fails. However, it seems reasonable for renku log to handle such situations, since an input file could contain sensitive data (e.g. an access token) which should never be committed.

rokroskar commented 4 years ago

Where does this situation arise? I can think of something like

renku log *

but does anyone do that?

What should happen in this case? Should the command just silently ignore the files that are not in git?

jachro commented 4 years ago

Looking at the problem from the user's point of view, I'd say it would be nice if renku log simply worked with inputs that are files not present in the repository, just as with any other files. After all, the file exists when the user runs the command; otherwise the command would simply fail. On the other hand, renku log could also mark such inputs somehow, as transient or something similar. To give some context: the whole problem came up in my https://kuba.dev.renku.ch/projects/renku-qa/kubas-datascience-in-bash project, which uses my Strava access token, stored in a gitignored file, as an input to renku run commands. I thought putting the token into a file would be the safest option, since if I passed it as an argument to the command it would end up in the KG.
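To illustrate the "transient" idea, here is a minimal sketch. It is not how renku builds its graph. `InputFile`, `classify_inputs` and the `.strava-token` file name are hypothetical, and the set of tracked paths is assumed to come from something like the output of `git ls-files`. Instead of failing on an input that is not in git, each input is simply labelled:

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Iterable, List, Set


@dataclass(frozen=True)
class InputFile:
    """Hypothetical record for one input of a recorded command."""

    path: str
    transient: bool  # True when the file is not tracked by git


def classify_inputs(inputs: Iterable[str], tracked: Set[str]) -> List[InputFile]:
    """Label each input as committed or transient instead of raising an error.

    `tracked` is assumed to hold repository-relative paths, for example the
    lines printed by `git ls-files`.
    """
    # Normalize both sides so that "./a/b" and "a/b" compare equal.
    normalized = {str(PurePosixPath(p)) for p in tracked}
    return [
        InputFile(path=p, transient=str(PurePosixPath(p)) not in normalized)
        for p in inputs
    ]


# Illustrative data only: two tracked files and one gitignored token file.
tracked_files = {"data/activities.csv", "scripts/fetch.sh"}
result = classify_inputs(["scripts/fetch.sh", ".strava-token"], tracked_files)
for item in result:
    print(item.path, "transient" if item.transient else "committed")
```

With a label like this, a log command could still list the token file as an input while making clear that its content is not in the repository.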

rokroskar commented 4 years ago

Oh I see, so the problem isn't only that renku log breaks if files aren't there but you want renku run to remember files that are in .gitignore and therefore renku log should return them in the KG. That is a completely different problem :)

jachro commented 4 years ago

Couldn't be said more precisely :)

Panaetius commented 4 years ago

This first needs a rework of graph building so that it is independent of git.