petl-developers / petl

Python Extract Transform and Load Tables of Data
MIT License
1.24k stars 193 forks source link

Feature Request: More examples on how to access columns in rows in lambda functions #671

Open redactedaccount opened 1 month ago

redactedaccount commented 1 month ago

Explain why petl needs this feature?

dot notation is fine, until your column name has a space in it, which for ETL operations on real databases or csvs, is most of them.

now you can just swap spaces with underscores, but it would be nice to know if there's another way

Describe the solution you would like

Examples that show other ways to access columns in rows other than row.baz, just in case you have something like Baz Products, or something.

Describe alternatives solutions you would have considered

I've tried: row['Column Name'] but you can't write to rows. this way because they are records. Having a lot of trouble making my own functions using petl.convert actually.

Source Code Examples

No response

Additional Notes

No response

Code of Conduct

bmaggard commented 1 month ago

Check out the documentation for the records method. Fields can be accessed as an attribute, by key, or by index: https://petl.readthedocs.io/en/latest/util.html#petl.util.base.records

An example in the documentation for the convert method could also be written as:

>>> # conversion can access other values from the same row
... table12 = etl.convert(table1, 'baz',
...                       lambda v, row: v * float(row['bar']),
...                       pass_row=True)

Other examples:

Accessing row as a mapping:

https://petl.readthedocs.io/en/latest/transform.html#petl.transform.basics.addfield https://petl.readthedocs.io/en/latest/transform.html#petl.transform.maps.rowmap https://petl.readthedocs.io/en/latest/transform.html#petl.transform.basics.addfields

Accessing row by index:

https://petl.readthedocs.io/en/latest/transform.html#petl.transform.maps.rowmapmany

redactedaccount commented 1 month ago

Thank you. I feel like a dummy.

Do you think you could hold my hand just a bit longer and show me how I'd alter a column in one row, conditional on two others, using a custom function? The one that gave me the most trouble was a datetime delta function (but that was because the lambda was eating the error)

Say my first column is called 'Can Cast' (bool) and the second column is called 'Spell Slots', and the third is called 'Status Effects'

Can Cast = True if Spell Slots > 0 and Status Effects not in Effects.Silence

Not actually related to anything I'm working on, just like using game metaphors.

I imagine we'd have to use pass_rows=True

Appreciate the help so far, don't feel the need to answer this one if you don't want to.

bmaggard commented 1 month ago

That's out of scope for this feature request. Feel free to post your question over on the Mailing List

ref: https://petl.readthedocs.io/en/latest/index.html#resources