Open Fuyujia799 opened 3 days ago
Hi, I noticed a potential bug in the chunking function inside src/agents/outline_writer.py.
Here is the relevant part of the code:
    def chunking(self, papers, titles, chunk_size=14000):
        paper_chunks, title_chunks = [], []
        total_length = self.token_counter.num_tokens_from_list_string(papers)
        num_of_chunks = int(total_length / chunk_size) + 1
        avg_len = int(total_length / num_of_chunks) + 1
        split_points = []
        l = 0
        for j in range(len(papers)):
            l += self.token_counter.num_tokens_from_string(papers[j])
            if l > avg_len:
                l = 0
                split_points.append(j)
                continue
        start = 0
        for point in split_points:
            paper_chunks.append(papers[start:point])
            title_chunks.append(titles[start:point])
            start = point
        paper_chunks.append(papers[start:])
        title_chunks.append(papers[start:])
        return paper_chunks, title_chunks
In the second-to-last line:
title_chunks.append(papers[start:])
I think it should be:
title_chunks.append(titles[start:])
Otherwise, the title_chunks list seems to end up containing chunks from papers instead of titles, which might be incorrect.
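To make the effect concrete, here is a minimal standalone sketch of the function with the proposed one-line fix applied. It is not the repository's code: count_tokens is a hypothetical stand-in for self.token_counter (it just counts whitespace-separated words), and the function is written without self so it can run on its own.

```python
def count_tokens(text):
    # Hypothetical stand-in for self.token_counter.num_tokens_from_string:
    # counts whitespace-separated words instead of real model tokens.
    return len(text.split())


def chunking(papers, titles, chunk_size=14000):
    paper_chunks, title_chunks = [], []
    total_length = sum(count_tokens(p) for p in papers)
    num_of_chunks = int(total_length / chunk_size) + 1
    avg_len = int(total_length / num_of_chunks) + 1

    # Record the indices at which the running token count exceeds avg_len.
    split_points = []
    running = 0
    for j in range(len(papers)):
        running += count_tokens(papers[j])
        if running > avg_len:
            running = 0
            split_points.append(j)

    # Slice papers and titles at the same indices so they stay aligned.
    start = 0
    for point in split_points:
        paper_chunks.append(papers[start:point])
        title_chunks.append(titles[start:point])
        start = point
    paper_chunks.append(papers[start:])
    title_chunks.append(titles[start:])  # proposed fix: titles, not papers
    return paper_chunks, title_chunks


papers = ["a b c", "d e f", "g h i", "j k l"]
titles = ["T0", "T1", "T2", "T3"]
paper_chunks, title_chunks = chunking(papers, titles, chunk_size=5)
print(title_chunks)
```

With the original line, the last entry of title_chunks would hold the paper text "j k l" rather than the title "T3". With the fix, every title chunk lines up with its paper chunk.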
Please verify this and let me know if it needs to be fixed.
Thanks!
Thanks for pointing it out! We've fixed the bug now :)