yu-iskw closed this issue 6 years ago
You can probably write a Scio job with the BQ and JDBC connectors? There are also other tools for dumping SQL databases, such as Sqoop, which IIRC can dump MySQL tables as Avro files that you can then load into BigQuery.
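A minimal sketch of the Scio route might look like the following. This is an illustration, not a tested job: the table, query, connection URL, credentials, and BigQuery table spec are all placeholders, and some method names (e.g. `saveAsTypedBigQuery`, `sc.run()`) differ across Scio versions.

```scala
import com.spotify.scio._
import com.spotify.scio.jdbc._
import com.spotify.scio.bigquery._

// Hypothetical row type; @BigQueryType.toTable derives the BigQuery schema.
@BigQueryType.toTable
case class Event(id: Long, name: String)

object MySqlToBigQuery {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, args) = ContextAndArgs(cmdlineArgs)

    // Placeholder MySQL connection details.
    val connOpts = JdbcConnectionOptions(
      username = "etl_user",
      password = Some(args("password")),
      connectionUrl = "jdbc:mysql://mysql.example.com/mydb",
      classOf[com.mysql.jdbc.Driver]
    )

    val readOpts = JdbcReadOptions(
      connectionOptions = connOpts,
      query = "SELECT id, name FROM events",
      rowMapper = rs => Event(rs.getLong("id"), rs.getString("name"))
    )

    // Read rows over JDBC, write them to a BigQuery table.
    sc.jdbcSelect(readOpts)
      .saveAsTypedBigQuery("my-project:mydataset.events")

    sc.run()
  }
}
```

Note that a single JDBC query is not itself parallelized the way a Spark or Sqoop dump is, so for very large tables you may still want to shard the read (e.g. one query per key range).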
Thank you for the advice. That is what I was thinking. Apache Beam or Scio with the BQ and JDBC connectors would be another great option, and dumping to Avro would work well too. I am really glad to hear that, since it confirms I am thinking in the right direction.
@nevillelyh
I'm just curious: how does Spotify transfer data from an RDB like MySQL to BigQuery now? I know spark-bigquery is in maintenance mode, so you might have a better way of doing it. We centralize all of our data in BigQuery, and we are still transferring data from MySQL to BigQuery with scheduled jobs. As you know, transferring a huge table can be very tough without a distributed processing framework such as Spark. If you have any other, better way to transfer data, would you please tell me about it?
I would really appreciate it if you could answer my question, as far as you can share on GitHub.