dvolgyes / zenodo_get

Zenodo_get: Downloader for Zenodo records
GNU Affero General Public License v3.0
130 stars, 21 forks

Use inside python #3

Open mpariente opened 4 years ago

mpariente commented 4 years ago

So the CLI is great! Is there a simple way to use this from inside Python? That would be very useful!

Thanks,

dvolgyes commented 4 years ago

Well, in a way yes, in a way no. You can use it from Python; I probably need to export one more function, but otherwise yes.

Basically the whole source code is just that: https://github.com/dvolgyes/zenodo_get/blob/master/zenodo_get/__main__.py

There are two ways to use it. One is to simulate command line arguments: I will export the main function, and afterwards it should work like this:

    from zenodo_get import zenodo_get
    zenodo_get(['param1','param2',....])

But this one also gives all the visual feedback (printing, etc.). Or you could take apart the above-mentioned file. The problem is that all the parts are quite small, so making a proper API is a bit overkill: you end up at the point where you could have used the Zenodo API in the first place.

There might be a compromise, e.g. suppressing the output and raising an exception instead of calling sys.exit().
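A minimal sketch of that compromise, assuming a CLI-style entry point that prints to stdout and may call sys.exit(). The names `call_quietly`, `CliError`, and `fake_cli_main` are hypothetical and are not part of zenodo_get:

```python
import contextlib
import io
import sys


class CliError(RuntimeError):
    """Raised when the wrapped CLI entry point exits with a failure code."""


def call_quietly(cli_main, argv):
    """Run cli_main(argv), capture stdout, and turn a failing sys.exit() into an exception."""
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            cli_main(argv)
    except SystemExit as exc:
        # sys.exit() and sys.exit(0) mean success; anything else is a failure.
        if exc.code not in (None, 0):
            raise CliError(f"CLI exited with code {exc.code!r}") from exc
    return captured.getvalue()


def fake_cli_main(argv):
    """Hypothetical stand-in for a CLI-style main function."""
    print("downloading", argv[0])
    if argv[0] == "bad":
        sys.exit(1)
```

The captured text is returned rather than discarded, so a caller can still log it. Output written directly to stderr, or by a progress bar that bypasses sys.stdout, would not be captured by this approach.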

How would you use it, and which features would you use? How would you process errors? What about feedback? (Callbacks are tricky, I don't want asyncio/threads, and if there is no display feedback, then what do you do when the file is e.g. several GB and takes dozens of minutes to download?)
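One possible answer to the feedback question is a plain synchronous callback invoked after each chunk, which needs neither asyncio nor threads. This is only a sketch; `copy_with_progress` and its parameters are hypothetical and do not describe how zenodo_get reports progress:

```python
import io


def copy_with_progress(src, dst, total=None, on_progress=None, chunk_size=64 * 1024):
    """Copy src to dst in chunks, calling on_progress(done, total) after each chunk."""
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        done += len(chunk)
        if on_progress is not None:
            on_progress(done, total)
    return done


if __name__ == "__main__":
    # In-memory demo; a real download would pass the HTTP response as src.
    data = io.BytesIO(b"x" * 200_000)
    copy_with_progress(data, io.BytesIO(), total=200_000,
                       on_progress=lambda done, total: print(f"{done}/{total} bytes"))
```

The callback runs in the caller's thread, so a slow callback slows the download. That trade-off keeps the code simple, and the caller decides whether to print, log, or ignore the updates.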

I did not spend much time writing it, maybe a day; I just wrote it for myself. I usually use the command line, or I just put it into a bash script and loop over the values. So my questions above are genuine, and I am not against improving it, but since it covers my use case, I need input on what the other use cases would be.

mpariente commented 4 years ago

Thanks for your answer.

What I want to do is to query a set of records from a community based on keywords, then I'll obtain the ids and download them. There are several things involved and most of them happen in Python, so I'd like to keep the download in Python as well.
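The query step could look roughly like the sketch below. It assumes the public Zenodo REST search endpoint at `https://zenodo.org/api/records` with `communities`, `q`, and `size` query parameters, and a JSON response that lists records under `hits.hits`; check the current Zenodo API documentation before relying on any of this. The function names are hypothetical:

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://zenodo.org/api/records"  # assumed search endpoint


def build_search_url(community, keywords, size=25):
    """Build a search URL for records of one community matching the keywords."""
    params = {"communities": community, "q": keywords, "size": size}
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_record_ids(payload):
    """Pull record ids out of a decoded search response (assumed layout: hits.hits[].id)."""
    return [hit["id"] for hit in payload.get("hits", {}).get("hits", [])]


def search_record_ids(community, keywords, size=25):
    """Query the API over the network and return the matching record ids."""
    url = build_search_url(community, keywords, size)
    with urllib.request.urlopen(url, timeout=30) as response:
        return extract_record_ids(json.load(response))
```

The ids returned by `search_record_ids` could then be passed one at a time to whatever download function is used.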

If you don't mind, I can use pieces of your code for my own use, and I'll reference your repo in my repo. Would this be OK for you? I think this will be easier than rewriting things to suit my use cases (which are kind of undefined for now), while keeping yours.

dvolgyes commented 4 years ago

Of course, you can modify it, reuse parts of it, etc.; that is the goal of free software. I only ask one thing: don't use the same name, so that it does not lead to confusion. It doesn't matter if it is similar, e.g. zget or zenodo_get_ng, just not the same. But otherwise feel free to use it in any way.

Side remark: the code was rushed. I needed something quickly, and when it worked, I more or less left it as it was. Therefore, it is undocumented and ugly, but at least it is not long, so it is not too hard to understand, and it works.

Implementation remark: I tried to deal with two problems. First, a download can time out or break, so files that have already been downloaded should not be downloaded again. Second, Zenodo sometimes returns a timeout/access error. So when you write your own, consider these two issues. :)
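A sketch of how those two issues might be handled together: skip a file whose checksum already matches, and retry a few times when the transfer fails. It assumes an MD5 checksum is known for each file and that the caller supplies a `fetch(path)` function that does the actual transfer; both are assumptions for illustration, not a description of zenodo_get's internals:

```python
import hashlib
import time
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_if_needed(path, expected_md5, fetch, retries=3, pause=1.0):
    """Return False if the file was already complete, True if it was downloaded."""
    path = Path(path)
    if path.exists() and md5_of(path) == expected_md5:
        return False  # already downloaded and intact
    last_error = None
    for attempt in range(retries):
        try:
            fetch(path)  # hypothetical callable that writes the file to path
            if md5_of(path) == expected_md5:
                return True
            last_error = ValueError(f"checksum mismatch for {path}")
        except OSError as exc:  # covers TimeoutError and urllib.error.URLError
            last_error = exc
        if attempt + 1 < retries:
            time.sleep(pause * (attempt + 1))  # wait a little longer each time
    raise RuntimeError(f"giving up on {path} after {retries} attempts") from last_error
```

Verifying the checksum instead of only checking that the file exists means a partially written file from a broken download is fetched again rather than silently accepted.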

dvolgyes commented 4 years ago

Partially helpful, but I pushed a new version (not on PyPI, but you can install it from git), and this one exports the zenodo_get function:

    import zenodo_get
    zenodo_get.zenodo_get(['cli_param1','cli_param2',...])

If an error occurs, it should throw an exception. It doesn't suppress any messages at this point.

There is one bug that I don't understand: if you call it with -h, the parser prints the correct message AND throws an exception, which it shouldn't. But I guess from Python code you would use the other features.
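A small sketch of how the exported function might be wrapped for the version discussed in this thread. It assumes the record id is accepted as a positional argument, as on the command line; `build_cli_args` and `download_record` are hypothetical helper names, and later releases of the package may expose a different interface:

```python
def build_cli_args(record, options=()):
    """Build the argument list exactly as it would appear on the command line."""
    return [*map(str, options), str(record)]


def download_record(record, options=()):
    """Download one record by calling the exported CLI-style function."""
    # Imported lazily so this module loads even without the package installed.
    import zenodo_get

    # Per the comment above, errors are expected to surface as exceptions.
    zenodo_get.zenodo_get(build_cli_args(record, options))
```

With this, `download_record(1234567)` would be the Python counterpart of running `zenodo_get 1234567` in a shell, where 1234567 is a placeholder record id.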
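One likely explanation, assuming the parser comes from Python's standard library: the built-in help action prints the message and then calls sys.exit(0), which raises SystemExit. That is normal on the command line but looks like a stray exception when the function is called from Python. The sketch below uses a throwaway argparse parser, not zenodo_get's real one, and `parse_or_none` is a hypothetical helper:

```python
import argparse
import contextlib
import io


def parse_or_none(parser, argv):
    """Return parsed args, or None if the parser exited cleanly (e.g. after -h)."""
    try:
        return parser.parse_args(argv)
    except SystemExit as exc:
        if exc.code in (None, 0):
            return None  # help was printed; not an error
        raise  # real usage errors still propagate


parser = argparse.ArgumentParser(prog="demo")  # hypothetical parser
parser.add_argument("record", nargs="?")

# Capture the help text instead of printing it to the terminal.
with contextlib.redirect_stdout(io.StringIO()) as help_text:
    result = parse_or_none(parser, ["-h"])
```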

It is more or less untested, but it seemed to work.

mpariente commented 4 years ago

Yes, it looks like it works when I clone and install. It didn't seem to work when I used this, though: pip3 install git+https://gitlab.com/dvolgyes/zenodo_get The syntax is weird for the arguments, but at least it works, thanks!

Would it be possible to add an output directory argument to download into a specified directory? It would be even cooler.

dvolgyes commented 4 years ago

Hi,

yes, I will add it tomorrow; I am out for the afternoon. And of course, it is just a workaround, but it literally supports everything the CLI supports. :)

I will make some helper functions, or probably wrap the whole thing into a class or a function with named parameters. (I should have done that in the first place, but as I said: time took priority over design.)

Pip install: that is strange, I will look into it.

(Sent by email in reply to Pariente Manuel's message of May 23, 2020, 14:05.)

dvolgyes commented 4 years ago

I added the directory option (-o / --output-dir), but the refactoring is still a future plan. If the directory doesn't exist, it is created, including any missing intermediate directories. If it exists, it is used as is. It doesn't check whether the directory is empty.
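The directory behaviour described above matches what the standard library offers directly. A minimal sketch, where `ensure_output_dir` is a hypothetical helper and not the function used inside zenodo_get:

```python
from pathlib import Path


def ensure_output_dir(path):
    """Create path and any missing parents; reuse it unchanged if it already exists."""
    directory = Path(path)
    # parents=True creates intermediate directories, exist_ok=True tolerates reuse.
    directory.mkdir(parents=True, exist_ok=True)
    return directory
```

As in the description, nothing here checks whether the directory is empty, so existing files stay in place.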

aburrell commented 1 year ago

If you end up refactoring at some point in the future, it would be useful to have the downloaded files as a returned variable or class attribute. Thanks for creating and maintaining this package!
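Until something like that exists, one workaround is to compare the output directory before and after the call. This is only a sketch: `run_and_list_new_files` is a hypothetical helper, and `download` stands for any callable that writes files into the given directory:

```python
from pathlib import Path


def run_and_list_new_files(download, output_dir):
    """Call download(output_dir) and return the files that appeared, sorted by path."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    before = {p for p in output_dir.rglob("*") if p.is_file()}
    download(output_dir)
    after = {p for p in output_dir.rglob("*") if p.is_file()}
    return sorted(after - before)
```

A limitation of this approach is that files skipped because they were already present are not reported, which is one reason a proper return value from the package would be preferable.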