roll closed 4 years ago
Hey @akariv @cschloer
This is an initial POC for the feature. It uses the special character # to refer to a row's number,
counting from 1 at the first data row (the header row is excluded).
population.csv
id,population
1,8
2,2
4,3
cities_comment.csv
city,comment
paris,city with population in row 2
london,city with population in row 1
rome,city with population in row 3
def test_join_row_number_format_string():
    from dataflows import Flow, load, join
    flow = Flow(
        load('data/population.csv'),
        load('data/cities_comment.csv'),
        join(
            source_name='population',
            # {#} expands to the 1-based row number of each population row
            source_key='city with population in row {#}',
            target_name='cities_comment',
            target_key='{comment}',
            fields={'population': {'name': 'population'}}
        ),
    )
    data = flow.results()[0]
    assert data == [[
        {'city': 'paris', 'population': 2, 'comment': 'city with population in row 2'},
        {'city': 'london', 'population': 8, 'comment': 'city with population in row 1'},
        {'city': 'rome', 'population': 3, 'comment': 'city with population in row 3'},
    ]]
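For anyone reading along, the matching that this test exercises can be sketched in plain Python. This is only an illustration, not the code in the PR: key_for and join_by_key are hypothetical helper names, and the sketch assumes that a key template is filled in from the row's fields plus an extra # entry holding the 1-based row number.

```python
def key_for(template, row, row_number):
    # Hypothetical helper: fill the template from the row's fields,
    # with '#' standing for the 1-based row number.
    return template.format(**{**row, '#': row_number})


def join_by_key(source_rows, source_key, target_rows, target_key, field):
    # Index the source rows by their computed key.
    index = {}
    for number, row in enumerate(source_rows, start=1):
        index[key_for(source_key, row, number)] = row

    # Look up each target row's key and copy the requested field over.
    joined = []
    for number, row in enumerate(target_rows, start=1):
        match = index.get(key_for(target_key, row, number))
        joined.append({**row, field: match[field] if match else None})
    return joined


population = [
    {'id': 1, 'population': 8},
    {'id': 2, 'population': 2},
    {'id': 4, 'population': 3},
]
cities = [
    {'city': 'paris', 'comment': 'city with population in row 2'},
    {'city': 'london', 'comment': 'city with population in row 1'},
    {'city': 'rome', 'comment': 'city with population in row 3'},
]

joined = join_by_key(
    population, 'city with population in row {#}',
    cities, '{comment}', 'population',
)
```

Note that the third population row has id 4 but still matches "row 3", since the key is built from the row's position rather than from its id column.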
Totals | |
---|---|
Change from base Build 458: | 0.0% |
Covered Lines: | 1749 |
Relevant Lines: | 2049 |
Hey, just ran this through a pipeline and it works great. I am able to do a horizontal concatenation just by setting source_key to ['#'] and target_key to ['#'].
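To spell out what keying both sides on # amounts to, here is a rough plain-Python sketch of a horizontal concatenation by row number. It does not call dataflows; concat_by_row_number is a hypothetical name, and filling a missing source row with None is an assumption made for the sketch, not a statement about how join behaves.

```python
def concat_by_row_number(source_rows, target_rows, fields):
    # Index the source rows by their 1-based position.
    by_number = dict(enumerate(source_rows, start=1))

    combined = []
    for number, row in enumerate(target_rows, start=1):
        source = by_number.get(number, {})
        merged = dict(row)
        for name in fields:
            # Assumption: fill with None when the source has no such row.
            merged[name] = source.get(name)
        combined.append(merged)
    return combined


left = [{'population': 8}, {'population': 2}, {'population': 3}]
right = [{'city': 'london'}, {'city': 'paris'}, {'city': 'rome'}, {'city': 'oslo'}]

rows = concat_by_row_number(left, right, ['population'])
```

Each target row picks up the fields of the source row at the same position, so the two tables end up side by side.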
@akariv Please please take a look :smiley:
Hey, this looks good - just update the documentation for this new option :)
Hi @akariv,
The docs are done and the PR is ready for a review
Tests fail because of https://github.com/frictionlessdata/tabulator-py/issues/309 (fixed in tabulator@1.38.3)