Closed daviswer closed 6 months ago
See comments here, specifically A and B
Makes `forward` and `generate_suffixes` more efficient by fusing ops and removing repeated allocations.
Outputs are confirmed to match the previous implementation to within 1e-6 error (the residual difference is due to very slightly different handling of the LN epsilon).
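For reviewers who want to see how an epsilon change alone can produce a discrepancy of this size, here is a minimal pure-Python sketch. It is not the code from this PR: the two epsilon placements (inside vs. outside the square root), the function names, the sample input, and `eps = 1e-6` are all assumptions chosen for illustration, since the description does not say exactly how the handling differs.

```python
import math

def layer_norm_eps_inside(x, eps):
    """Normalize x, adding eps to the variance before the square root."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]

def layer_norm_eps_outside(x, eps):
    """Normalize x, adding eps to the standard deviation after the square root."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / (math.sqrt(var) + eps)
    return [(v - mean) * inv_std for v in x]

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two equal-length lists."""
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical input and epsilon, for illustration only.
x = [0.5, -1.0, 2.0, 0.25, -0.75]
eps = 1e-6

diff = max_abs_diff(layer_norm_eps_inside(x, eps), layer_norm_eps_outside(x, eps))
outputs_match = diff < 1e-6  # same tolerance as quoted in this PR
print(f"max abs diff: {diff:.3e}, within tolerance: {outputs_match}")
```

The same kind of max-absolute-difference check can be run against the real model outputs before and after the change; the tolerance should be picked with the actual epsilon and dtype in mind.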
Once this lands in our PR, we can mirror the changes to the TGI branch.