I get same origin y values and bbox values for obviously different spans

This is not a bug! You were confused by the fact that all pages are rotated by 90 degrees. But, as documented, extracted coordinates are always relative to the unrotated page. So in those coordinates the lines (or spans) are, roughly speaking, "columns", which is why visibly different lines can share the same origin y value. You can remove the rotation from pages before extraction to get more canonical results. BTW, page.clean_contents() is unnecessary here and only costs time. E.g.:

import pymupdf
from pymupdf import TEXT_PRESERVE_WHITESPACE, TEXT_PRESERVE_SPANS, TEXT_MEDIABOX_CLIP
from pathlib import Path

p = Path("Tora.pdf")
pdf_document = pymupdf.open(p, filetype=".pdf")
flags = TEXT_PRESERVE_WHITESPACE | TEXT_PRESERVE_SPANS | TEXT_MEDIABOX_CLIP
page = pdf_document[1]
# Remove the page rotation first, so extracted coordinates match what is displayed.
page.remove_rotation()
# Flatten blocks -> lines -> spans into a single list of spans.
spans = [
    s
    for b in page.get_text("dict", flags=flags)["blocks"]
    for line in b["lines"]
    for s in line["spans"]
]
# Sort top to bottom (origin y), then left to right (origin x).
spans.sort(key=lambda s: (s["origin"][1], s["origin"][0]))
for s in spans:
    print(f'{s["origin"]=}, {s["text"]=}')

This delivers the following result:

s["origin"]=(71.43280029296875, 99.65480041503906), s["text"]='Dear readers!'
s["origin"]=(71.43280029296875, 150.05499267578125), s["text"]='In the book “The Torah tells me“, we will read about the creation, our forefathers; '
s["origin"]=(71.43280029296875, 166.85499572753906), s["text"]='Abraham, Isaac and Jacob, the alliance between G-D and the Jewish nation and about '
s["origin"]=(71.43280029296875, 183.65499877929688), s["text"]='the birth of the twelve tribes of Israel.'
s["origin"]=(71.43280029296875, 217.2550048828125), s["text"]='We will conclude this volume with reading about Joseph becoming the viceroy of '
s["origin"]=(71.43280029296875, 234.0550079345703), s["text"]='Pharaoh, save Egypt from the 7 years of hunger and got back to his brothers.'
s["origin"]=(71.43280029296875, 267.6549987792969), s["text"]='The Torah is not only a book with nice stories, Torah it’s a lesson for life. The word '
s["origin"]=(71.43280029296875, 284.4549865722656), s["text"]='“Torah” in Hebrew means direction, G-d gave us the Torah to direct us through our '
s["origin"]=(71.43280029296875, 301.2549743652344), s["text"]='life. '
s["origin"]=(71.43280029296875, 334.85498046875), s["text"]='It is signi'
s["origin"]=(123.07740020751953, 334.85498046875), s["text"]='fi'
s["origin"]=(126.9708023071289, 334.85498046875), s["text"]=' cant that we will read the Torah and I wish you to be able to read the Torah '
s["origin"]=(71.43280029296875, 351.65496826171875), s["text"]='in the original language, Hebrew. Meanwhile, I’m happy to present to you the “The '
s["origin"]=(71.43280029296875, 368.4549560546875), s["text"]='Torah tells me” in Georgian with lovely illustrations.'
s["origin"]=(284.3070068359375, 402.0549621582031), s["text"]='*******'
s["origin"]=(71.43280029296875, 435.65496826171875), s["text"]='I would like to thank '
s["origin"]=(191.58079528808594, 435.65496826171875), s["text"]='“The Rothschild Foundation EU”'
s["origin"]=(392.68798828125, 435.65496826171875), s["text"]=' for their '
s["origin"]=(445.1669921875, 435.65496826171875), s["text"]='fi'
s["origin"]=(449.0603942871094, 435.65496826171875), s["text"]=' nancial support.'
s["origin"]=(71.43280029296875, 469.2549743652344), s["text"]='Particular thanks to the team that without them this book would not be published:  '
s["origin"]=(71.43280029296875, 486.0549621582031), s["text"]='Mrs. Marina Baazov, Mrs. Tzippora Kozlovsky, Mrs. Svetlana Chachanashvili,'
s["origin"]=(71.43280029296875, 502.8549499511719), s["text"]='Mrs. Sara Feinstein, Mrs. Salome Filpan, Rabbi Ben-Zion Israelshvili, '
s["origin"]=(71.43280029296875, 519.6549682617188), s["text"]='Mr. Aharon Janashvili, Mr. Menachem Kozlovsky and the Illustrator '
s["origin"]=(71.43280029296875, 536.4549560546875), s["text"]='Mrs. Devorah Kozlovsky (dmkozo@gmail.com).'
s["origin"]=(71.43280029296875, 570.054931640625), s["text"]='Using this opportunity, I would like to pass my grateful thanks to the'
s["origin"]=(514.0469970703125, 570.054931640625), s["text"]=' “Or '
s["origin"]=(71.43280029296875, 586.8549194335938), s["text"]='Avner foundation”, “The Leviov Foundation”,'
s["origin"]=(373.9909973144531, 586.8549194335938), s["text"]=' and last but not least to '
s["origin"]=(71.43280029296875, 603.6549072265625), s["text"]='Mr. Michael Mirilashvili'
s["origin"]=(220.6154022216797, 603.6549072265625), s["text"]=' for  their ongoing support of the “Or Avner Jewish Day '
s["origin"]=(71.43280029296875, 620.4548950195312), s["text"]='school” in Tbilisi.'
s["origin"]=(509.8623962402344, 687.6549072265625), s["text"]='Yours,'
s["origin"]=(417.86419677734375, 704.4548950195312), s["text"]='Rabbi Meir Kozlovsky'
s["origin"]=(295.29779052734375, 819.7260131835938), s["text"]='2'
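
To illustrate why the coordinates of the rotated page looked so odd, here is a minimal pure-Python sketch (not PyMuPDF code). The helper `to_rotated` is hypothetical, and it assumes a top-left origin with y growing downwards, a clockwise rotation in multiples of 90 degrees, and made-up page dimensions and origins. It maps a point from unrotated page coordinates to the coordinates of the page as displayed:

```python
def to_rotated(point, page_width, page_height, rotation):
    """Map a point from unrotated page coordinates to displayed coordinates.

    Hypothetical helper. Assumes a top-left origin, y growing downwards and
    a clockwise rotation; page_width / page_height describe the UNROTATED page.
    """
    x, y = point
    rotation %= 360
    if rotation == 0:
        return (x, y)
    if rotation == 90:
        return (page_height - y, x)
    if rotation == 180:
        return (page_width - x, page_height - y)
    if rotation == 270:
        return (y, page_width - x)
    raise ValueError("rotation must be a multiple of 90 degrees")


# Assumed values: an unrotated page of 842 x 595 points and two span origins
# that share the same y in unrotated coordinates.
first = to_rotated((150.0, 523.5), 842, 595, 90)
second = to_rotated((166.8, 523.5), 842, 595, 90)
print(first, second)
```

Under these assumptions the shared unrotated y becomes a shared displayed x (the left margin), while the two spans land on different displayed y values, i.e. on different lines.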

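In the result above, several spans share one baseline (same origin y), for example around the "fi" ligature. If you want whole lines back, here is a rough sketch that groups spans by rounded origin y and joins their text left to right. The helper `join_spans_by_baseline` and its `ndigits` parameter are my own illustration, not part of PyMuPDF, and rounding may split a line if two origins fall on either side of a rounding boundary:

```python
from itertools import groupby


def join_spans_by_baseline(spans, ndigits=1):
    """Join the text of spans that share (approximately) the same origin y.

    Hypothetical helper: spans are dicts with "origin" and "text" keys, as in
    the "dict" output used above.
    """
    def baseline(span):
        return round(span["origin"][1], ndigits)

    # Sort top to bottom, then left to right, so groupby sees each line once.
    ordered = sorted(spans, key=lambda s: (baseline(s), s["origin"][0]))
    return [
        "".join(s["text"] for s in group)
        for _, group in groupby(ordered, key=baseline)
    ]


# A few spans copied from the result above, deliberately out of order.
sample = [
    {"origin": (445.1669921875, 435.65496826171875), "text": "fi"},
    {"origin": (71.43280029296875, 435.65496826171875), "text": "I would like to thank "},
    {"origin": (449.0603942871094, 435.65496826171875), "text": " nancial support."},
    {"origin": (284.3070068359375, 402.0549621582031), "text": "*******"},
    {"origin": (392.68798828125, 435.65496826171875), "text": " for their "},
    {"origin": (191.58079528808594, 435.65496826171875), "text": "“The Rothschild Foundation EU”"},
]
lines = join_spans_by_baseline(sample)
for text in lines:
    print(text)
```

Note that the space after the ligature ("fi nancial") comes from the span text itself, so the sketch does not remove it.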
pymupdf / PyMuPDF

I get same origin y values and bbox values for obviously different spans #3689
