Closed jonrmitchell closed 3 years ago
Hi @jonrmitchell, thanks for your contribution. Use case makes sense and quick first look at the code looks good. I have some time today in my afternoon to do a proper code review and human test. Stay tuned and thx again!
@jonrmitchell i have tomorrow a bit of of a busy day but can give your updates a spin on Tuesday morning (EST time) and come back. All looking good so far!
@DiegoPino thanks! I'm not in a big rush really, this is just a package I'd like to use on a side project (if we can get to where it meets my use cases). If it takes a little longer to get reviewed, I don't mind.
@jonrmitchell so sorry for the delay. It has been a crazy workweek (and still not finished). The code looks good and all works. I need to refresh my memory so will give the code a second read tomorrow and will then approve and merge. If you want to talk about the optional argument for remote resources to be downloaded/not downloaded, happy to give that a look. Still, to preserve current (expected functionality) we probably will want to make the default to download, but happy to be spoken out of that. I need to check the other packages and see their interfaces, its easier to make a point if there is consistency between language. Again sorry, this will be merged by tomorrow. Thanks!
No worries, thanks for taking a look.
From looking at the python library, to_zip()
does not download remote files:
https://github.com/frictionlessdata/frictionless-py/blob/main/frictionless/package.py#L587-L589
Also, from what I can tell, the python library won't move files into a temporary location for zipping either (unless explicitly listed as in-memory resources). I think it expects you to already have the data files with a path relative to the datapackage.json file https://github.com/frictionlessdata/frictionless-py/blob/main/frictionless/package.py#L617-L618 and then zipping is a simple matter of directly zipping what is on-disk.
And if this is the intent... then perhaps this entire pull request is misguided, and adding a resource with an absolute path and expecting it to "fix" the path to be relative should not work, but paths for resources need to be expected to be used and exported exactly as provided. That would explain why this functionality that I thought was so basic could be missing.
Another thing I recognized from reviewing the python library is that save()
is too simplistic and missing nuance (does it save over what's there? does it make a copy? etc) and instead having a toZip()
method would more closely explain what is going to happen when called.
In any case don't feel like you need to merge this tomorrow, let's talk this through.
But then again, if resources weren't meant to be downloaded when zipped, well that behavior wasn't working before I came along either... so I don't know.
@jonrmitchell it took me (again) some time to think about your last post and here is my take:
fopen()
in php
deals with remote sources as with files. PHP devs know that. Probably the original coder had/not had that in mind but I do not see this as an issue that should be solved by this Pull. Time to merge.This is all good work and very thankful for your patience. Let's keep in touch to make this package better =)
Sounds good. Thanks for taking the time to review it.
Overview
Update the save() method to set relative paths for resources. Clones the current package so the path changes only apply to the outputted package, which is what they do in the datapackage-js library (https://github.com/frictionlessdata/datapackage-js/blob/master/src/package.js#L266) Fixes #41.
This is only the second time I've ever submitted a pull request so if there's anything I'm doing wrong please let me know.
Please preserve this line to notify @DiegoPino (lead of this repository)