uber / petastorm

Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Apache License 2.0

Import petastorm.spark in init #549

Closed praateekmahajan closed 4 years ago

praateekmahajan commented 4 years ago

Issue

  1. I install petastorm with pip install petastorm
  2. I run from petastorm.spark import X, Y, Z and get the error
     ModuleNotFoundError: No module named 'petastorm.spark'

Fix

  • Added import petastorm.spark in init
  • Updated the version so pip builds happen.

I'm not sure this is the right approach, since there are nuances around releases and how petastorm currently manages its Python modules.

praateekmahajan commented 4 years ago

Thanks @selitvin, that makes sense. The only reason I created this PR is that I thought it would be faster than an issue, since it shows the requested code change. However, that might not have been a good decision, so I can move it to an issue if you want.

I updated my PR to reflect what the issue is, which is

  1. I install petastorm with pip install petastorm
  2. I run from petastorm.spark import X, Y, Z and get the error
     ModuleNotFoundError: No module named 'petastorm.spark'

selitvin commented 4 years ago

The released Petastorm does not have a spark subpackage (there is a new petastorm.spark package coming in the next release, but it is not in the current public one). What were you trying to import?
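
One way to confirm this locally is to check whether the installed distribution actually ships the subpackage. Below is a minimal sketch using only the Python standard library (Python 3.8+); the helper name describe_subpackage is made up for illustration and is not part of petastorm.

```python
import importlib.util
from importlib import metadata


def describe_subpackage(dist_name, module_name):
    """Return (installed_version, module_found) for a distribution and a dotted module name.

    describe_subpackage is a hypothetical helper, not a petastorm API.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # the distribution is not installed at all

    try:
        # find_spec imports the parent package, then searches its path for the
        # submodule; it returns None when the submodule does not exist.
        found = importlib.util.find_spec(module_name) is not None
    except ImportError:
        found = False  # the parent package is missing or failed to import

    return version, found


if __name__ == "__main__":
    version, found = describe_subpackage("petastorm", "petastorm.spark")
    print(f"petastorm version: {version}, petastorm.spark found: {found}")
```

If the second value is False while a version is reported, the installed release predates the subpackage, and adding an import to the package's init would not make it appear.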

praateekmahajan commented 4 years ago

Oh I see, the subpackage/release part makes sense. I can close the PR and wait for the next release. Out of curiosity, where are subpackages defined?

I was trying to do the following (from the example)

from petastorm.spark import SparkDatasetConverter, make_spark_converter
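
Until a release that includes petastorm.spark is installed, that import can be wrapped so the failure is more explicit. This is only a sketch: the wrapper name load_spark_converter_api and the error text are illustrative and not part of petastorm. The same error would also be raised if a dependency of the subpackage fails to import.

```python
def load_spark_converter_api():
    """Import the two converter names, or raise a more descriptive ImportError.

    load_spark_converter_api is a hypothetical wrapper, not a petastorm API.
    """
    try:
        from petastorm.spark import SparkDatasetConverter, make_spark_converter
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so this also
        # covers the "No module named 'petastorm.spark'" case.
        raise ImportError(
            "could not import petastorm.spark; the installed petastorm "
            "may predate this subpackage"
        ) from exc
    return SparkDatasetConverter, make_spark_converter
```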

Abhishaike commented 4 years ago

Hey @praateekmahajan, running import petastorm.spark also gives me the error

ModuleNotFoundError: No module named 'petastorm.spark'

Not quite understanding your fix, any advice?