Closed jonkeane closed 2 months ago
The output at https://github.com/arrowrbook/book/blob/b854f06e518ef694552d025f9578e11a376fe86e/files_and_formats.qmd#L426 needs to be pretty-printed rather than shown in exponential (scientific) notation.
In https://github.com/arrowrbook/book/blob/b854f06e518ef694552d025f9578e11a376fe86e/files_and_formats.qmd#L500 we should add a note that these work in R and Python separately, each in its own language. You have access to the data from the other language, but it's not automatic (because how would that even work?).
In https://github.com/arrowrbook/book/blob/b854f06e518ef694552d025f9578e11a376fe86e/files_and_formats.qmd#L627 something is off with the <number> bit. If I'm reading https://parquet.apache.org/docs/file-format/data-pages/encodings/#dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8 correctly, I don't think there is one specific number. Rather, it's a combination of the number of values and their string lengths, such that if the dictionary becomes too large the encoding falls back to PLAIN.
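
To illustrate the last point, here is a rough toy model in Python of why the fallback depends on both the number of distinct values and their lengths, not on a single count. This is only a sketch of the idea, not the actual Parquet writer logic: the function names are made up, the 1 MiB limit is an assumed value chosen for illustration, and real writers make this decision incrementally while writing a column chunk.

```python
# Toy model of the dictionary-size fallback; NOT the real Parquet writer logic.
# ASSUMED_DICT_PAGE_LIMIT is a hypothetical byte limit chosen for illustration.
ASSUMED_DICT_PAGE_LIMIT = 1024 * 1024


def dictionary_bytes(distinct_values):
    """Approximate size of a string dictionary: each entry is stored as a
    4-byte length prefix followed by its UTF-8 bytes."""
    return sum(4 + len(v.encode("utf-8")) for v in distinct_values)


def choose_encoding(values, limit=ASSUMED_DICT_PAGE_LIMIT):
    """Pick dictionary encoding while the dictionary fits under the limit,
    otherwise fall back to PLAIN."""
    distinct = set(values)
    if dictionary_bytes(distinct) <= limit:
        return "RLE_DICTIONARY"
    return "PLAIN"


# Only two distinct values, but each is very long: the dictionary is too big.
few_long = ["x" * 600_000, "y" * 600_000] * 10

# 50,000 distinct short values: the dictionary still fits.
many_short = [f"id{i:06d}" for i in range(50_000)]

# 100,000 distinct short values: same string length, but now it is too big.
more_short = [f"id{i:06d}" for i in range(100_000)]

print(choose_encoding(few_long))
print(choose_encoding(many_short))
print(choose_encoding(more_short))
```

Under these assumptions, the cutoff is a byte budget: two very long strings can trigger the fallback while tens of thousands of short ones do not, which is why a single <number> in the text is misleading.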