Hey @mkoskl, in order to provide you a more detailed answer I would need to fully understand your use case first, however here are some techniques that might be sufficient:
<?php

$rows = df()->read(from_json(...))->fetch(); // returns an instance of Rows, which implements the ArrayAccess interface

df()
    ->read(from_json(...))
    ->batchSize(100)
    ->run(function (Rows $rows) {
        // you will get 100 rows at once here
    });

df()
    ->read(from_json(...))
    ->batchSize(100)
    ->forEach(function (Rows $rows) {
        // you will get 100 rows at once here
    });

df()
    ->read(from_json(...))
    ->batchSize(100)
    ->get(); // returns \Generator<Rows> where each Rows instance has 100 rows

df()
    ->read(from_json(...))
    ->getEach(); // returns \Generator<Row> - access one row at a time

df()
    ->read(from_json(...))
    ->getEachAsArray(); // yields one row at a time as an array

df()
    ->read(from_json(...))
    ->batchSize(100)
    ->getEachAsArray(); // yields arrays of rows (each row also represented as an array), 100 per batch
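For completeness, consuming the \Generator<Rows> returned by get() could look roughly like the sketch below. This is not from the original answer: data.json is a placeholder file name, and the inner foreach assumes a Rows instance can be iterated row by row (it at least implements ArrayAccess, as noted above).

<?php

// Usage sketch: iterate batches of 100 rows, then individual rows within each batch.
// Assumes the DSL functions df() and from_json() are imported, as in the snippets above.
foreach (df()->read(from_json('data.json'))->batchSize(100)->get() as $rows) {
    foreach ($rows as $row) {
        // process a single Row here, e.g. map it to an insert into the new system
    }
}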
Not sure whether it is the recommended way to do things, but we have an old self-made migration script. It's not very flexible and I'm trying to make it so.
We had an extractor for each of the separate (but related) source files, e.g. CSV files. Now I replaced many of those with a script that uses flow-php to read and join those files into a single parquet file.
Now I need to write the data from this result file to the new system.
And our old script just loops over the data and inserts it into the database of the new system.
Thanks for the explanation, it makes perfect sense and I believe it's a very valid approach to migrate away from handmade migration scripts to an ETL approach 👏
And our old script just loops over the data and inserts it into the database of the new system.
Not sure what type of database you are using, but in that case I would recommend going with https://github.com/flow-php/flow/blob/1.x/docs/components/adapters/doctrine.md
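If the target is a relational database, the loading step with that adapter could look roughly like the sketch below. This is an illustration, not part of the original answer: to_dbal_table_insert() comes from the Doctrine adapter linked above, but its exact name and signature may differ between releases (and write() may be named load() in some versions), while the connection parameters, file name and table name are placeholders.

<?php

// Hedged sketch: write the joined parquet file into a database table via the Doctrine adapter.
// Assumes df(), from_parquet() and to_dbal_table_insert() are imported from their DSL namespaces;
// check the adapter docs linked above for the exact signatures in your version.
df()
    ->read(from_parquet('joined_result.parquet'))                     // placeholder path to the result file
    ->batchSize(100)
    ->write(to_dbal_table_insert(
        ['url' => 'pgsql://user:pass@localhost:5432/new_system'],     // placeholder Doctrine DBAL connection params
        'target_table'                                                // placeholder table name
    ))
    ->run();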
If for any reason that does not work for you, there is another approach you can take: implement your own Loader. DbalLoader can be a good inspiration, as sketched below.
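A minimal custom Loader might look like the sketch below. This is not from the original answer: the Loader interface is assumed here to declare load(Rows $rows, FlowContext $context): void (the exact shape varies between versions, so mirror DbalLoader from the repository), and NewSystemClient with its insert() method is a hypothetical stand-in for the target system's API.

<?php

use Flow\ETL\{FlowContext, Loader, Rows};

// Hedged sketch of a custom Loader pushing each batch into the new system.
final class NewSystemLoader implements Loader
{
    public function __construct(private readonly NewSystemClient $client) // hypothetical client
    {
    }

    public function load(Rows $rows, FlowContext $context) : void
    {
        // toArray() is assumed to expose the rows as plain PHP arrays.
        foreach ($rows->toArray() as $row) {
            $this->client->insert($row); // hypothetical insert into the target database
        }
    }
}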
Please let me know in case of any issues. Also, feel free to join our Discord server, where I'm responding much faster than here, cheers!
Ok, thanks for the help! I'll close this issue now!
Hello,
how can I iterate over a DataFrame?
I'm trying to include flow-php in my data migration project. I can read my data sources, join them, and store them as a parquet file.
But how can I iterate over the resulting DataFrame?
Br, Miika