astropy / astropy

Astronomy and astrophysics core library
https://www.astropy.org
BSD 3-Clause "New" or "Revised" License

Allow sub-folders in affiliated package data folder #1241

Closed cdeil closed 11 years ago

cdeil commented 11 years ago

I have started a package tevpy using the affiliated package template.

Now I'd like to include a few datasets to use in tests and examples and docs, similar to how scikit-learn or scikit-image have data sub-packages.

The issue I've run into is that when I add a sub-folder in tevpy/data the package install fails like this:

error: can't copy 'tevpy/data/poisson_stats_image': doesn't exist or not a regular file

Here's the code and the travis-ci log.

Am I doing something wrong? Would it be possible to allow sub-folders in the data folder to be able to group the data files?

cdeil commented 11 years ago

I've made a simple example here to illustrate the problem.

I'll work around the issue in tevpy that I mentioned above by putting all files in the top-level folder for now, so the code link I gave above no longer shows the problem.

astrofrog commented 11 years ago

I think that data directories usually have to be specified with the package_data option in setup.py. For example:

      package_data={'astrodendro.test':['*.npz', 'benchmark_data/*.fits']},

Can you specify this manually? If this doesn't work, you can also include a setup_package.py file inside e.g. tevpy and define a get_package_data function in it. See:

http://docs.astropy.org/en/stable/development/building_packaging.html?highlight=get_package_data#customizing-setup-build-for-subpackages

for more details. I'm not sure if this works for affiliated packages, but I think it should?
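A minimal sketch of what such a setup_package.py could look like, assuming the convention described in the linked docs (a get_package_data function returning a dictionary that maps a package name to a list of file patterns relative to that package). The file location and the patterns are illustrative guesses based on the names in this thread, not the actual tevpy layout:

```python
# tevpy/setup_package.py  (hypothetical location)


def get_package_data():
    """Return a mapping of package name -> list of data file patterns.

    Patterns are relative to the package directory. The names below are
    assumptions taken from this thread and only serve as an illustration.
    """
    return {
        'tevpy.data': [
            # files inside one sub-folder of tevpy/data
            'poisson_stats_image/*',
        ],
    }
```

Each pattern should match only regular files, so a sub-folder that itself contains sub-folders would need its own entry.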

cdeil commented 11 years ago

@astrofrog I tried both options you mention and can't get them to work.

Can you try to make the simple example in https://github.com/astropy/package-template/pull/28 work somehow?

cdeil commented 11 years ago

I think I now got it to work and understand what is going on.

For reference:

# A dictionary to keep track of all package data to install
package_data = {PACKAGENAME: ['data/*']}

If you add a sub-folder in data, you have to change this list so that every pattern matches only files, not directories.

embray commented 11 years ago

This can just be changed to:

package_data = {PACKAGENAME: ['data/*.*', 'data/poisson_stats_image/*.*']}

or even just

package_data = {PACKAGENAME: ['data/*.*', 'data/*/*.*']}

Otherwise, a bare wildcard won't recurse into subdirectories, because distutils is essentially calling shutil.copy2(), not shutil.copytree(), on each wildcard match.
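To see the difference between the patterns, here is a small sketch that builds a throwaway directory tree and expands each pattern with the standard glob module. It assumes the package_data patterns are expanded with ordinary glob matching; README.txt is a made-up file name, and counts.fits.gz is borrowed from the tevpy example:

```python
import glob
import os
import tempfile


def make_tree(root):
    """Create data/README.txt and data/poisson_stats_image/counts.fits.gz."""
    data = os.path.join(root, 'data')
    sub = os.path.join(data, 'poisson_stats_image')
    os.makedirs(sub)
    for path in (os.path.join(data, 'README.txt'),
                 os.path.join(sub, 'counts.fits.gz')):
        with open(path, 'w') as f:
            f.write('placeholder')


def matches(root, pattern):
    """Return the sorted matches for pattern, relative to root, with '/'."""
    found = glob.glob(os.path.join(root, pattern))
    return sorted(os.path.relpath(p, root).replace(os.sep, '/')
                  for p in found)


with tempfile.TemporaryDirectory() as root:
    make_tree(root)
    # The bare wildcard also matches the sub-folder itself.
    bare = matches(root, 'data/*')
    # The dotted wildcards match only names that contain a dot.
    files_only = matches(root, 'data/*.*') + matches(root, 'data/*/*.*')

print(bare)
print(files_only)
```

Note that `*.*` is only a heuristic: it would skip a data file without an extension and would still match a directory whose name contains a dot.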

cdeil commented 11 years ago

@iguananaut I think I've now run into a real issue with get_pkg_data_filename ... but I'm not 100% sure. I get this error with Python 3, but not Python 2:


In [2]: tevpy.data.poisson_stats_image()
ERROR: AttributeError: 'NoneType' object has no attribute 'split' [astropy.utils.data]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-c050b9fcebb1> in <module>()
----> 1 tevpy.data.poisson_stats_image()

/Users/deil/Library/Python/3.2/lib/python/site-packages/tevpy-0.1-py3.2-macosx-10.8-x86_64.egg/tevpy/data/__init__.py in poisson_stats_image(extra_info)
     70     else:
     71         filename = 'poisson_stats_image/counts.fits.gz'
---> 72         out = fits.getdata(get_pkg_data_filename(filename))
     73 
     74     return out

/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/astropy/utils/data.py in get_pkg_data_filename(data_name)
    451             return hashfn
    452     else:
--> 453         datafn = _find_pkg_data_path(data_name)
    454         if os.path.isdir(datafn):
    455             raise IOError("Tried to access a data file that's actually "

/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/astropy/utils/data.py in _find_pkg_data_path(data_name)
    691         # not called from inside an astropy package.  So just pass name through
    692         return data_name
--> 693     rootpkgname = module.__package__.split('.')[0]
    694 
    695     rootpkg = __import__(rootpkgname)

AttributeError: 'NoneType' object has no attribute 'split'

Am I doing something wrong? https://github.com/gammapy/tevpy/blob/master/tevpy/data/__init__.py#L72

embray commented 11 years ago

I don't think you're doing anything wrong. I didn't write that function but I'm not sure that module.__package__ can be relied on the way it's being used here. I'll have to take a closer look.
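For illustration only, here is a sketch of how the root package name could be derived without assuming that `__package__` is set. The helper name root_package_name is hypothetical and this is not the code in astropy.utils.data; it only shows the failure mode from the traceback and one possible fallback to `__name__`:

```python
import types


def root_package_name(module):
    """Return the top-level package name for a module (hypothetical helper)."""
    # __package__ may be None or missing, so fall back to __name__.
    dotted = getattr(module, '__package__', None) or module.__name__
    return dotted.split('.')[0]


# Simulate a module whose __package__ was never filled in.
fake = types.ModuleType('tevpy.data')
fake.__package__ = None

print(root_package_name(fake))
```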

cdeil commented 11 years ago

I've filed a separate issue.