I recently tried building a larger model (~517 nodes) in FINN and got stuck at CreateDataflowPartition for a long time. I did some timing measurements and traced most of the time to PartitionFromDict. Since I supply a large partition dict for a large model, this step takes a long time. The changes I propose here keep the functionality but speed this transform up a bit:
If the node is not in the graph, the if block is never executed, but the check still scans the whole graph for every key in the partitioning. Moving the check out in front of the loop enables an early return when the node is not in the graph.
Since the index of the node in the graph does not change while the function executes, it does not need to be looked up in every loop iteration and can be moved out of the loop as well.
For smaller models and partitionings this is negligible; however, it helped in my use case, and others might benefit from it as well.
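To illustrate the pattern, here is a minimal sketch with simplified stand-ins for the real FINN model and graph objects (the actual PartitionFromDict code differs; the names below are illustrative, not the FINN API). Both versions produce the same assignments, but the second hoists the invariant membership check and index lookup out of the loop:

```python
def assign_partition_before(graph_nodes, node_name, partition_id, assignments):
    """Original pattern: the membership check and index lookup run
    inside the loop, once per iteration (loop body otherwise elided)."""
    for _ in graph_nodes:
        if node_name in graph_nodes:            # scans the whole graph each iteration
            idx = graph_nodes.index(node_name)  # scans the graph again
            assignments[idx] = partition_id

def assign_partition_after(graph_nodes, node_name, partition_id, assignments):
    """Proposed pattern: early return for absent nodes, index cached once."""
    if node_name not in graph_nodes:
        return  # early return: node not in graph, no iteration needed
    idx = graph_nodes.index(node_name)  # the index never changes; look it up once
    for _ in graph_nodes:
        assignments[idx] = partition_id
```

For a graph with N nodes and a partition dict with K keys, this turns roughly O(K * N^2) worth of scanning into O(K * N), which only becomes noticeable at the scale described above.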