pydiverse/pydiverse.pipedag
A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
15 stars
2 forks
issues
Local file caching for dataframe/eager tasks
#73
windiana42
closed
1 year ago
7
implemented ignore_task_version option
#72
windiana42
closed
1 year ago
0
Add option that disables explicit version checking for some instances
#71
windiana42
closed
1 year ago
0
fix ibm_db_sa bug when copying dataframes from cache: uppercase table names by default
#70
windiana42
closed
1 year ago
2
Pandas <-> SQL type mapping and chunksize=100k
#69
windiana42
closed
1 year ago
1
Add polars and tidypolars support for pipedag
#68
windiana42
closed
1 year ago
1
Add support for IBIS as a way to programmatically create SQL (and tests)
#67
windiana42
closed
1 year ago
1
Use "select *" where possible instead of writing out all columns of a table
#66
windiana42
closed
1 year ago
0
Increase tasks.output_json length
#65
windiana42
closed
1 year ago
1
Integration tests with various pandas and sqlalchemy versions
#64
windiana42
closed
1 year ago
1
Set isolation level for mssql and DB2
#63
windiana42
opened
1 year ago
0
Improve pessimistic lazy caching
#62
windiana42
opened
1 year ago
5
closes #57: replace format_exception with format_exc
#61
windiana42
closed
1 year ago
1
Temp fix CI by running DB2 tests separately with only one concurrent configuration
#60
windiana42
closed
1 year ago
1
Test _trigger_deferred_table_store_ops branch `started_ops_end = max(...)+1`
#59
windiana42
closed
1 year ago
1
Make sure multiple scope exceptions are transported through remote procedure call
#58
windiana42
closed
1 year ago
2
Really support python 3.9
#57
windiana42
closed
1 year ago
2
Support pandas 2.0 + polars with apache arrow backed dataframes
#56
windiana42
closed
1 year ago
2
New Feature: implement local file cache (i.e. parquet files) for tasks working on dataframes
#55
windiana42
closed
1 year ago
1
Don't copy any data in case of 100% cache valid stages
#54
windiana42
closed
1 year ago
3
Fail tests if they emit logger.error()
#53
windiana42
opened
1 year ago
0
Limit context output for stack trace output of errors in IPCWorker/IPCServer scenario
#52
windiana42
opened
1 year ago
1
Can we put an authentication token in communication between tasks and RunConfigServer?
#51
windiana42
closed
1 year ago
2
Changing output table name should invalidate cache or should be accounted for in cache-copying
#50
windiana42
closed
1 year ago
0
Common case fast: If all tasks in a stage are cache valid, then there should not be spent time to write data to a new schema
#49
windiana42
closed
1 year ago
0
Better error message when sql is not wrapped in sa.text()
#48
windiana42
opened
1 year ago
0
Fix structlog Warning "Remove format_exc_info from your processor chain"
#47
windiana42
closed
1 year ago
1
Speed up performance for DB2
#46
windiana42
closed
1 year ago
0
Reduce dialect specific name mangling code duplication
#45
windiana42
closed
1 year ago
3
Fix primary key / index with same column on different tables within schema
#44
windiana42
closed
1 year ago
0
Move config module outside of util / generally optimize public interface of pipedag
#43
windiana42
closed
1 year ago
4
Better error message when indexes is not list of list
#42
windiana42
closed
1 year ago
1
Optimize Speed for DB2
#41
windiana42
opened
1 year ago
1
Implemented #39: option avoid_drop_create_schema
#40
windiana42
closed
1 year ago
1
Support for DB2 environment where CREATE SCHEMA is not permitted
#39
windiana42
closed
1 year ago
0
Ipython / jupyter runs of pipedag (i.e. Interactive Window in VS Code)
#38
windiana42
closed
1 year ago
1
add some tests for tasks using pydiverse.transform
#37
windiana42
closed
1 year ago
5
Prepare release 0.2.0 by improving README files for pypi
#36
windiana42
closed
1 year ago
1
Would it be feasible to auto-detect input_type parameter to @materialize in simple cases?
#35
windiana42
opened
1 year ago
2
fail_fast: false still swallows exceptions
#34
windiana42
closed
1 year ago
1
Create Primary Keys and Indexes
#33
windiana42
closed
1 year ago
1
Extract cache management functionality out of core.py
#32
windiana42
closed
1 year ago
2
Add setuptools as dev-dependency to enable PyCharm to run tests.
#31
windiana42
closed
1 year ago
0
Expand tests to also test cache validity after cache invalidation.
#30
windiana42
closed
1 year ago
0
Sql refactoring
#29
NMAC427
closed
1 year ago
0
Speed up and expand testing
#28
NMAC427
closed
1 year ago
1
Add ruff as Linter
#27
NMAC427
closed
1 year ago
0
Added comments + slight encoding changes.
#26
windiana42
closed
1 year ago
0
We still seem to have some problems with '\n' in error message forwarding within flow.run()
#25
windiana42
closed
1 year ago
5
Fix cache behaviour (see #22, #23)
#24
NMAC427
closed
1 year ago
0