paradigmxyz / cryo

cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes
Apache License 2.0

add empty chunk handling, importing python crate dependencies, tx filtering by address, support for new datatypes in python crate #147

Open davidthegardens opened 9 months ago

davidthegardens commented 9 months ago

Motivation

  1. Closes #137. pip also did not install polars and some other libraries; this is fixed as well.
  2. Added filtering of transactions by --address, which acts as an OR over filtering by --to-address and --from-address.
  3. Added the --write-empty argument; when it is not set, empty dataframes are not written to disk.
  4. Added support for the new datatypes in the python crate.
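
To make the OR semantics in item 2 concrete, here is a minimal sketch in plain Rust. It is not the code from this PR: the `Tx` struct and the `matches_address` and `filter_by_address` functions are hypothetical names invented for illustration, and the case-insensitive comparison is an assumption about how hex addresses might be matched.

```rust
/// Hypothetical, simplified transaction record (not cryo's actual type).
#[derive(Debug, Clone, PartialEq)]
struct Tx {
    hash: &'static str,
    from: &'static str,
    /// `None` models a contract-creation transaction with no recipient.
    to: Option<&'static str>,
}

/// Returns true if `address` is the sender OR the recipient of `tx`.
/// Case-insensitive comparison is an assumption made for this sketch.
fn matches_address(tx: &Tx, address: &str) -> bool {
    tx.from.eq_ignore_ascii_case(address)
        || tx.to.map_or(false, |to| to.eq_ignore_ascii_case(address))
}

/// Keeps every transaction that touches `address` on either side.
fn filter_by_address(txs: &[Tx], address: &str) -> Vec<Tx> {
    txs.iter()
        .filter(|tx| matches_address(tx, address))
        .cloned()
        .collect()
}
```

Because the predicate is evaluated once per transaction, a transaction whose sender and recipient are both the given address is still returned only once in this sketch.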

Solution

  1. Added the packages and versions to the build file.
  2. Created a closure that reuses the conditions of the to and from filtering, added the from-address into env.execution, and created a global vector that logs the tx hashes that have already been moved, in order to prevent the duplicates that arise from async.
  3. Added support for the flag, plus a couple of lines that check the dataframe's shape. If the shape is 0 and --write-empty is false, it continues rather than writing to disk.
  4. Allowed the python crate to take in the new datatypes (like transaction). I can't take much of the credit; this was mostly already supported.
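
The deduplication in item 2 and the empty-chunk check in item 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `Row`, `merge_dedup`, and `should_write` are hypothetical names, and a `HashSet` passed in by the caller stands in for the global vector of tx hashes described above.

```rust
use std::collections::HashSet;

/// Hypothetical row type standing in for one transaction record.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    hash: String,
    block: u64,
}

/// Merges the results of the "to" query and the "from" query, keeping only
/// the first occurrence of each tx hash. `seen` persists across calls, so a
/// hash already emitted by an earlier chunk is dropped as well.
/// (Assumption: a HashSet replaces the global vector used in the PR.)
fn merge_dedup(to_rows: Vec<Row>, from_rows: Vec<Row>, seen: &mut HashSet<String>) -> Vec<Row> {
    to_rows
        .into_iter()
        .chain(from_rows)
        // `insert` returns false when the hash was already present.
        .filter(|row| seen.insert(row.hash.clone()))
        .collect()
}

/// Decides whether a chunk is written to disk: non-empty chunks always are,
/// empty chunks only when the write-empty option is enabled.
fn should_write(row_count: usize, write_empty: bool) -> bool {
    row_count > 0 || write_empty
}
```

In a concurrent setting the shared set would need synchronization (for example, a mutex around it); that part is left out of the sketch to keep it short.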

PR Checklist

sslivkoff commented 8 months ago

heyo. everything is looking great on the pr. the last thing is get_addresses(). does request.addresses() have everything you need to avoid the get_addresses() function?

davidthegardens commented 8 months ago

> heyo. everything is looking great on the pr. the last thing is get_addresses(). does request.addresses() have everything you need to avoid the get_addresses() function?

Hey, thank you! So request.addresses() doesn't seem to exist and request.address doesn't have all the information needed unfortunately.