Closed dcmoura closed 2 years ago
Base: 95.63% // Head: 95.63% // Increases project coverage by +0.00% :tada:
Coverage data is based on head (e40cba3) compared to base (09c311b). Patch coverage: 100.00% of modified lines in pull request are covered.
:exclamation: Current head e40cba3 differs from pull request most recent head e3fe384. Consider uploading reports for the commit e3fe384 to get more accurate results.
This PR introduces the dot operator for universal access to the fields of different file formats.
Consider the following files:
data.csv
data.json
Now we can use the same syntax to query from json and csv:
Still, the following versions are more CPU efficient and should be considered when dealing with large files, especially in the CSV case:
In the JSON case, using a lookup operator is faster than using an attribute access operator (though this might be improved). In the CSV case, the dot operator involves creating a dictionary for each row; improving this would require changing the way we read CSVs, and it would always be slower than reading each line as a list, as we do today.
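To illustrate the CSV trade-off, here is a minimal sketch using only the Python standard library. It is not SPyQL's reader: the sample data and the helper names `rows_as_lists` and `rows_as_dicts` are hypothetical. It only shows why access by name needs an extra dictionary per row, while positional access does not.

```python
import csv
import io

# Hypothetical sample data, not taken from this PR.
SAMPLE = "a,b\n1,2\n3,4\n"

def rows_as_lists(text):
    """Yield each data row as a plain list (positional access only)."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header line
    for row in reader:
        yield row

def rows_as_dicts(text):
    """Yield each data row as a dict, which costs one extra dict per row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    for row in reader:
        # This zip + dict construction is the per-row overhead.
        yield dict(zip(header, row))

# Same values either way; only the access style and the cost differ.
by_position = [row[0] for row in rows_as_lists(SAMPLE)]
by_name = [row["a"] for row in rows_as_dicts(SAMPLE)]
```

Both comprehensions read the first column; the dictionary version simply does more work per row to allow access by field name.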
This functionality was already available using the `row` keyword; the dot operator just makes it more practical and reduces clutter. Under the hood, `.a` is replaced by `row.a` before processing the query. This is done via regex, which is tricky (there might be some corner cases that I overlooked).

I am planning on doing a silent update (without documenting it in the README), since this is being addressed in the new documentation that should be released soon.
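As a rough illustration of the regex rewriting described above, the following sketch expands a leading dot into `row.`. The pattern and the function name `expand_dot_operator` are assumptions made for this example and are not the regex used in this PR. The last call shows one kind of corner case a naive pattern runs into: text inside a string literal is rewritten too.

```python
import re

# Assumed pattern (not the one in this PR): a dot that starts an identifier
# and is not preceded by a word character, ')', ']' or another dot.
DOT_FIELD = re.compile(r"(?<![\w)\].])\.([A-Za-z_]\w*)")

def expand_dot_operator(query: str) -> str:
    """Rewrite `.field` as `row.field`, leaving `obj.attr` and `1.5` alone."""
    return DOT_FIELD.sub(r"row.\1", query)

# Leading dots are expanded.
expanded = expand_dot_operator("SELECT .a, .b + 1 FROM csv")

# Existing attribute access and float literals are left untouched.
unchanged = expand_dot_operator("SELECT row.a, x.upper(), 1.5 FROM json")

# Corner case: the naive pattern also rewrites inside string literals.
corner_case = expand_dot_operator("SELECT 'see .a' FROM csv")
```

A more robust version would have to skip string literals and comments, which is part of what makes a purely regex-based approach tricky.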