openvenues / libpostal

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
MIT License

set uid/gid on data file tarballs to 0 #267

Closed aviks closed 6 years ago

aviks commented 6 years ago

I've been trying to build libpostal in a limited, containerised environment. When extracting the data files, tar fails with:

tar: address_expansions/address_dictionary.dat: Cannot change ownership to uid 2000, gid 2000: Invalid argument
numex/
tar: address_expansions: Cannot change ownership to uid 2000, gid 2000: Invalid argument
numex/numex.dat
tar: numex/numex.dat: Cannot change ownership to uid 2000, gid 2000: Invalid argument
transliteration/
tar: numex: Cannot change ownership to uid 2000, gid 2000: Invalid argument
transliteration/transliteration.dat
tar: transliteration/transliteration.dat: Cannot change ownership to uid 2000, gid 2000: Invalid argument
tar: transliteration: Cannot change ownership to uid 2000, gid 2000: Invalid argument
tar: Exiting with failure status due to previous errors
make[2]: *** [Makefile:4170: all-local] Error 2

I believe the general convention for Unix tarball releases is to set uid/gid to 0, which makes the releases compatible with a wider set of environments. I think tar cf .... --owner=0 --group=0 will do that.

Would you be open to making that change?
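
As a rough illustration of the suggestion above, the following sketch builds a small archive with ownership recorded as 0/0 and lists it to check the result. It assumes GNU tar; the working directory and the placeholder numex.dat file are made up for the example and are not libpostal's actual release process.

```shell
#!/bin/sh
# Sketch: build a data tarball whose entries are recorded as uid 0 / gid 0.
# Assumes GNU tar; the paths and file contents are placeholders for illustration.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/numex"
printf 'placeholder\n' > "$workdir/numex/numex.dat"

# --owner/--group override the ownership written into the archive headers;
# --numeric-owner stores numeric ids rather than user/group names.
tar -C "$workdir" -czf "$workdir/data.tar.gz" \
    --owner=0 --group=0 --numeric-owner numex

# List the archive with numeric ids; every entry should show 0/0 as owner/group.
listing=$(tar -tzvf "$workdir/data.tar.gz" --numeric-owner)
printf '%s\n' "$listing"
```

An archive built this way no longer asks the extracting tar to chown files to an id such as 2000 that may not exist in a restricted container.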

albarrentine commented 6 years ago

Hm, tar on Mac appears not to support --owner=0, and I'd prefer that both create/extract work the same way on both Mac and Linux.

That should be possible at extraction time, though, with the --no-same-owner option (that's the default for ordinary users, but not when tar runs as root, which may be the case in a limited environment). I've added that option explicitly, which should fix your issue. Let me know.
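
A minimal sketch of the extraction-side approach, assuming GNU tar for the archive-creation step (the --owner=2000 flag is used here only to mimic the ownership in the reported error). The directory names and placeholder file are hypothetical, and this is not the exact command from libpostal's Makefile.

```shell
#!/bin/sh
# Sketch: extract an archive without restoring the ownership recorded in it.
# Paths and file contents are placeholders for illustration.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/src/numex" "$workdir/dest"
printf 'placeholder\n' > "$workdir/src/numex/numex.dat"

# Record uid/gid 2000 in the archive to mimic the tarball from the report
# (GNU tar flags, assumed here only to set up the example).
tar -C "$workdir/src" -czf "$workdir/data.tar.gz" \
    --owner=2000 --group=2000 --numeric-owner numex

# --no-same-owner makes tar leave extracted files owned by the invoking user
# instead of trying to chown them to the recorded uid/gid.
tar -C "$workdir/dest" -xzf "$workdir/data.tar.gz" --no-same-owner

# Compare the extracted file's numeric owner with the current user id.
file_uid=$(ls -ln "$workdir/dest/numex/numex.dat" | awk '{print $3}')
echo "extracted file uid: $file_uid (current uid: $(id -u))"
```

Because the flag is applied at extraction time, the published tarballs can stay unchanged and the same create command keeps working on both Mac and Linux.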

aviks commented 6 years ago

Yes, that sounds reasonable.

albarrentine commented 6 years ago

This should be implemented in master as of https://github.com/openvenues/libpostal/commit/669e52b329017348af81af2f40292f56f85d5de3. Let me know if it's still not working.