Open stoneflying opened 10 months ago
Thank you for your efforts. I found that in the NewCoreBPE function, the result of the following code does not seem to be used anywhere,
sortedTokenBytes := make([][]byte, 0, len(encoder)) for k := range encoder { sortedTokenBytes = append(sortedTokenBytes, []byte(k)) } sort.Slice(sortedTokenBytes, func(i, j int) bool { return bytes.Compare(sortedTokenBytes[i], sortedTokenBytes[j]) < 0 }) return &CoreBPE{ ...... sortedTokenBytes: sortedTokenBytes, }, nil
but this sorting operation seems to be very expensive. Is there any consideration here?
Yes, this code currently is not necessary. This variable is mainly used when multi-threaded encoding is needed, but this feature has not been implemented yet. I will implement it later, maybe this week.
Thank you for your efforts. I found that in the NewCoreBPE function, the result of the following code does not seem to be used anywhere,
but this sorting operation seems to be very expensive. Is there any consideration here?