Closed Ibrokhimsadikov closed 3 years ago
Currently, fastLink without blocking cannot handle tables with millions of rows. For linkages that large, you would need either splink (an independently developed version of fastLink for Apache Spark) or fastLink with blocking. I am not a fastLink developer, but I routinely use fastLink with approximately 0.1 * 3.5 million rows, which in practice requires at least two blocks.
The developers are working on making fastLink faster. Hopefully, they can release a faster version in 2021.
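To illustrate why blocking helps at this scale, here is a minimal, generic sketch in Python. It is not fastLink's or splink's API; the function names, the toy records, and the choice of `birth_year` as the blocking key are all made up for illustration. The idea is that only records sharing a blocking key value are compared, so the number of candidate pairs drops well below the full cross product.

```python
from collections import defaultdict


def block_by_key(records, key):
    """Group record indices by the value of a blocking key (hypothetical helper)."""
    blocks = defaultdict(list)
    for i, rec in enumerate(records):
        blocks[rec[key]].append(i)
    return blocks


def candidate_pairs(records_a, records_b, key):
    """Yield (index_a, index_b) only for records that fall in the same block."""
    blocks_a = block_by_key(records_a, key)
    blocks_b = block_by_key(records_b, key)
    # Only key values present in both tables can produce candidate pairs.
    for value in blocks_a.keys() & blocks_b.keys():
        for i in blocks_a[value]:
            for j in blocks_b[value]:
                yield (i, j)


# Toy data, invented for this example.
records_a = [
    {"name": "ann", "birth_year": 1980},
    {"name": "bob", "birth_year": 1980},
    {"name": "cat", "birth_year": 1991},
]
records_b = [
    {"name": "anne", "birth_year": 1980},
    {"name": "kat", "birth_year": 1991},
    {"name": "dan", "birth_year": 1975},
]

pairs = sorted(candidate_pairs(records_a, records_b, "birth_year"))
full = len(records_a) * len(records_b)
print(f"{len(pairs)} candidate pairs instead of {full}")
```

The trade-off is that true matches whose blocking key differs (for example, a mistyped birth year) are never compared, so the blocking variable should be one that is recorded reliably in both tables.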
Thank you, @aalexandersson, for your answer. I much appreciate your opinion.
I tried to search for this question but could not find any performance-related answers. Could anyone tell me whether fastLink is scalable enough for tables that exceed a million rows? Thank you.