uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Add a build in CI for pyspark 3.0 #521

Status: Closed (liangz1 closed 4 years ago)

liangz1 commented 4 years ago

The PyPI package has not been released yet. Once it is available, I'll update the code.

codecov[bot] commented 4 years ago

Codecov Report

Merging #521 into master will increase coverage by 0.05%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #521      +/-   ##
==========================================
+ Coverage   86.18%   86.24%   +0.05%     
==========================================
  Files          81       81              
  Lines        4467     4471       +4     
  Branches      717      718       +1     
==========================================
+ Hits         3850     3856       +6     
+ Misses        505      503       -2     
  Partials      112      112              
Impacted Files            Coverage Δ
petastorm/unischema.py    95.79% <100.00%> (+1.03%) :arrow_up:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 4df17f3...2cab881.

WeichenXu123 commented 4 years ago

@liangz1 I built the latest master PySpark and uploaded it to S3. Use `pip install https://ml-team-public-read.s3-us-west-2.amazonaws.com/pyspark-3.1.0.dev0-60dd1a690fed62b1d6442cdc8cf3f89ef4304d5a.tar.gz` to install the latest Spark.
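
After installing a development build like this, it can help to confirm which PySpark version the environment actually picked up. The snippet below is only a minimal sketch and is not part of this pull request: the helper names `spark_major_version` and `is_spark3_or_later` are hypothetical, and it assumes the version string starts with a numeric major component (as in `3.1.0.dev0`).

```python
def spark_major_version(version: str) -> int:
    """Return the leading (major) component of a version string such as '3.1.0.dev0'."""
    return int(version.split(".")[0])


def is_spark3_or_later(version: str) -> bool:
    """True when the major version is 3 or higher (hypothetical helper)."""
    return spark_major_version(version) >= 3


if __name__ == "__main__":
    try:
        import pyspark  # only needed for the live check below
    except ImportError:
        print("pyspark is not installed in this environment")
    else:
        # pyspark.__version__ is the installed package's version string
        print(pyspark.__version__, is_spark3_or_later(pyspark.__version__))
```

A check of this kind could gate Spark 3 specific code paths or tests, though whether the project does so is not stated in this thread.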

WeichenXu123 commented 4 years ago

Summary: