thiswillbeyourgithub / LogseqPDFImporter

Import PDFs into Logseq, including annotations made in other software
GNU General Public License v3.0

Non-unique UUID (ReadCube) #7

Closed: mp68 closed this issue 4 months ago

mp68 commented 5 months ago

Thank you for your great work on making this possible for Logseq! For research purposes I'm using the ReadCube Papers app as a reference manager and for on-device annotations. It has the nice option of exporting the annotations embedded in a PDF file. Unfortunately, they appear to use non-unique UUIDs. It would be amazing if there were a workaround for this, as the Papers app + Logseq would make for a really nice research workflow.

Console error:

{'page': 5, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30518341064453, 'y1': 276.05499267578125, 'x2': 545.44482421875, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30518341064453, 'y1': 276.05499267578125, 'x2': 545.44482421875, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}], 'page': 5}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 5, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.92518615722656, 'y1': 326.4649963378906, 'x2': 545.474853515625, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.92518615722656, 'y1': 326.4649963378906, 'x2': 545.474853515625, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}], 'page': 5}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 5, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 71.25518798828125, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 388.6549987792969, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.25518798828125, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 388.6549987792969, 'width': 612.0, 'height': 792.0}], 'page': 5}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.52500915527344, 'y1': 151.85498046875, 'x2': 543.7050170898438, 'y2': 213.864990234375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.52500915527344, 'y1': 151.85498046875, 'x2': 543.7050170898438, 'y2': 213.864990234375, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.84500122070312, 'y1': 556.6749877929688, 'x2': 543.7850341796875, 'y2': 593.4849853515625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.84500122070312, 'y1': 556.6749877929688, 'x2': 543.7850341796875, 'y2': 593.4849853515625, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.66500854492188, 'y1': 582.0549926757812, 'x2': 543.7349853515625, 'y2': 618.6849975585938, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.66500854492188, 'y1': 582.0549926757812, 'x2': 543.7349853515625, 'y2': 618.6849975585938, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 72.44499969482422, 'y1': 632.4549560546875, 'x2': 540.0050048828125, 'y2': 669.2650146484375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 72.44499969482422, 'y1': 632.4549560546875, 'x2': 540.0050048828125, 'y2': 669.2650146484375, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.68499755859375, 'y1': 683.2149658203125, 'x2': 543.8150024414062, 'y2': 719.8450317382812, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.68499755859375, 'y1': 683.2149658203125, 'x2': 543.8150024414062, 'y2': 719.8450317382812, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 10, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.72518920898438, 'y1': 273.53497314453125, 'x2': 545.46484375, 'y2': 310.5250244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.72518920898438, 'y1': 273.53497314453125, 'x2': 545.46484375, 'y2': 310.5250244140625, 'width': 612.0, 'height': 792.0}], 'page': 10}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 10, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.24518585205078, 'y1': 299.0950012207031, 'x2': 545.434814453125, 'y2': 335.7350158691406, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.24518585205078, 'y1': 299.0950012207031, 'x2': 545.434814453125, 'y2': 335.7350158691406, 'width': 612.0, 'height': 792.0}], 'page': 10}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 10, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30518341064453, 'y1': 324.30499267578125, 'x2': 545.454833984375, 'y2': 411.69500732421875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30518341064453, 'y1': 324.30499267578125, 'x2': 545.454833984375, 'y2': 411.69500732421875, 'width': 612.0, 'height': 792.0}], 'page': 10}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.10517883300781, 'y1': 174.7149658203125, 'x2': 545.454833984375, 'y2': 211.5250244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.10517883300781, 'y1': 174.7149658203125, 'x2': 545.454833984375, 'y2': 211.5250244140625, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30519104003906, 'y1': 200.094970703125, 'x2': 545.46484375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30519104003906, 'y1': 200.094970703125, 'x2': 545.46484375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.77519226074219, 'y1': 250.67498779296875, 'x2': 545.4948120117188, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.77519226074219, 'y1': 250.67498779296875, 'x2': 545.4948120117188, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.69517517089844, 'y1': 554.1649780273438, 'x2': 545.434814453125, 'y2': 616.35498046875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.69517517089844, 'y1': 554.1649780273438, 'x2': 545.434814453125, 'y2': 616.35498046875, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.54519653320312, 'y1': 225.29498291015625, 'x2': 545.454833984375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.54519653320312, 'y1': 225.29498291015625, 'x2': 545.454833984375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.72518920898438, 'y1': 276.05499267578125, 'x2': 545.474853515625, 'y2': 312.875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.72518920898438, 'y1': 276.05499267578125, 'x2': 545.474853515625, 'y2': 312.875, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.87518310546875, 'y1': 326.4649963378906, 'x2': 545.454833984375, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.87518310546875, 'y1': 326.4649963378906, 'x2': 545.454833984375, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.98518371582031, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 439.2350158691406, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.98518371582031, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 439.2350158691406, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.0451889038086, 'y1': 427.80499267578125, 'x2': 545.48486328125, 'y2': 489.81500244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.0451889038086, 'y1': 427.80499267578125, 'x2': 545.48486328125, 'y2': 489.81500244140625, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 14, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.04519653320312, 'y1': 73.7349853515625, 'x2': 545.44482421875, 'y2': 135.56500244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.04519653320312, 'y1': 73.7349853515625, 'x2': 545.44482421875, 'y2': 135.56500244140625, 'width': 612.0, 'height': 792.0}], 'page': 14}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.92517852783203, 'y1': 200.27496337890625, 'x2': 545.474853515625, 'y2': 262.2850341796875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.92517852783203, 'y1': 200.27496337890625, 'x2': 545.474853515625, 'y2': 262.2850341796875, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30518341064453, 'y1': 427.9749755859375, 'x2': 545.44482421875, 'y2': 464.4250183105469, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30518341064453, 'y1': 427.9749755859375, 'x2': 545.44482421875, 'y2': 464.4250183105469, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.9551773071289, 'y1': 503.7550048828125, 'x2': 545.46484375, 'y2': 540.385009765625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.9551773071289, 'y1': 503.7550048828125, 'x2': 545.46484375, 'y2': 540.385009765625, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.9251937866211, 'y1': 630.125, 'x2': 545.48486328125, 'y2': 692.3150024414062, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.9251937866211, 'y1': 630.125, 'x2': 545.48486328125, 'y2': 692.3150024414062, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.77519226074219, 'y1': 604.9149780273438, 'x2': 545.434814453125, 'y2': 641.5549926757812, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.77519226074219, 'y1': 604.9149780273438, 'x2': 545.434814453125, 'y2': 641.5549926757812, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 62, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 40.17000198364258, 'y1': 62.08001708984375, 'x2': 585.0899658203125, 'y2': 734.219970703125, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 40.17000198364258, 'y1': 62.08001708984375, 'x2': 585.0899658203125, 'y2': 734.219970703125, 'width': 612.0, 'height': 792.0}], 'page': 62}, 'content': {'text': '[:span]', 'image_id': '62_78365c7b-a656-3539-9f45-c61c6ee3a8ba'}, 'id #uuid': '78365c7b-a656-3539-9f45-c61c6ee3a8ba', 'author': 'mp68'}
{'page': 62, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}], 'page': 62}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
{'page': 62, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}], 'page': 62}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 5, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30518341064453, 'y1': 276.05499267578125, 'x2': 545.44482421875, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30518341064453, 'y1': 276.05499267578125, 'x2': 545.44482421875, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}], 'page': 5}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 5, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.92518615722656, 'y1': 326.4649963378906, 'x2': 545.474853515625, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.92518615722656, 'y1': 326.4649963378906, 'x2': 545.474853515625, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}], 'page': 5}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 5, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 71.25518798828125, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 388.6549987792969, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.25518798828125, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 388.6549987792969, 'width': 612.0, 'height': 792.0}], 'page': 5}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.52500915527344, 'y1': 151.85498046875, 'x2': 543.7050170898438, 'y2': 213.864990234375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.52500915527344, 'y1': 151.85498046875, 'x2': 543.7050170898438, 'y2': 213.864990234375, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.84500122070312, 'y1': 556.6749877929688, 'x2': 543.7850341796875, 'y2': 593.4849853515625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.84500122070312, 'y1': 556.6749877929688, 'x2': 543.7850341796875, 'y2': 593.4849853515625, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.66500854492188, 'y1': 582.0549926757812, 'x2': 543.7349853515625, 'y2': 618.6849975585938, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.66500854492188, 'y1': 582.0549926757812, 'x2': 543.7349853515625, 'y2': 618.6849975585938, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 72.44499969482422, 'y1': 632.4549560546875, 'x2': 540.0050048828125, 'y2': 669.2650146484375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 72.44499969482422, 'y1': 632.4549560546875, 'x2': 540.0050048828125, 'y2': 669.2650146484375, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 8, 'properties': {'color': 'red'}, 'position': {'bounding': {'x1': 71.68499755859375, 'y1': 683.2149658203125, 'x2': 543.8150024414062, 'y2': 719.8450317382812, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 71.68499755859375, 'y1': 683.2149658203125, 'x2': 543.8150024414062, 'y2': 719.8450317382812, 'width': 612.0, 'height': 792.0}], 'page': 8}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 10, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.72518920898438, 'y1': 273.53497314453125, 'x2': 545.46484375, 'y2': 310.5250244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.72518920898438, 'y1': 273.53497314453125, 'x2': 545.46484375, 'y2': 310.5250244140625, 'width': 612.0, 'height': 792.0}], 'page': 10}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 10, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.24518585205078, 'y1': 299.0950012207031, 'x2': 545.434814453125, 'y2': 335.7350158691406, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.24518585205078, 'y1': 299.0950012207031, 'x2': 545.434814453125, 'y2': 335.7350158691406, 'width': 612.0, 'height': 792.0}], 'page': 10}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 10, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30518341064453, 'y1': 324.30499267578125, 'x2': 545.454833984375, 'y2': 411.69500732421875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30518341064453, 'y1': 324.30499267578125, 'x2': 545.454833984375, 'y2': 411.69500732421875, 'width': 612.0, 'height': 792.0}], 'page': 10}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.10517883300781, 'y1': 174.7149658203125, 'x2': 545.454833984375, 'y2': 211.5250244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.10517883300781, 'y1': 174.7149658203125, 'x2': 545.454833984375, 'y2': 211.5250244140625, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30519104003906, 'y1': 200.094970703125, 'x2': 545.46484375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30519104003906, 'y1': 200.094970703125, 'x2': 545.46484375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.77519226074219, 'y1': 250.67498779296875, 'x2': 545.4948120117188, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.77519226074219, 'y1': 250.67498779296875, 'x2': 545.4948120117188, 'y2': 337.89501953125, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 12, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.69517517089844, 'y1': 554.1649780273438, 'x2': 545.434814453125, 'y2': 616.35498046875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.69517517089844, 'y1': 554.1649780273438, 'x2': 545.434814453125, 'y2': 616.35498046875, 'width': 612.0, 'height': 792.0}], 'page': 12}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.54519653320312, 'y1': 225.29498291015625, 'x2': 545.454833984375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.54519653320312, 'y1': 225.29498291015625, 'x2': 545.454833984375, 'y2': 262.10498046875, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.72518920898438, 'y1': 276.05499267578125, 'x2': 545.474853515625, 'y2': 312.875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.72518920898438, 'y1': 276.05499267578125, 'x2': 545.474853515625, 'y2': 312.875, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.87518310546875, 'y1': 326.4649963378906, 'x2': 545.454833984375, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.87518310546875, 'y1': 326.4649963378906, 'x2': 545.454833984375, 'y2': 363.45501708984375, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.98518371582031, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 439.2350158691406, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.98518371582031, 'y1': 352.0249938964844, 'x2': 545.44482421875, 'y2': 439.2350158691406, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 13, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.0451889038086, 'y1': 427.80499267578125, 'x2': 545.48486328125, 'y2': 489.81500244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.0451889038086, 'y1': 427.80499267578125, 'x2': 545.48486328125, 'y2': 489.81500244140625, 'width': 612.0, 'height': 792.0}], 'page': 13}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 14, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.04519653320312, 'y1': 73.7349853515625, 'x2': 545.44482421875, 'y2': 135.56500244140625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.04519653320312, 'y1': 73.7349853515625, 'x2': 545.44482421875, 'y2': 135.56500244140625, 'width': 612.0, 'height': 792.0}], 'page': 14}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.92517852783203, 'y1': 200.27496337890625, 'x2': 545.474853515625, 'y2': 262.2850341796875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.92517852783203, 'y1': 200.27496337890625, 'x2': 545.474853515625, 'y2': 262.2850341796875, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 70.30518341064453, 'y1': 427.9749755859375, 'x2': 545.44482421875, 'y2': 464.4250183105469, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 70.30518341064453, 'y1': 427.9749755859375, 'x2': 545.44482421875, 'y2': 464.4250183105469, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.9551773071289, 'y1': 503.7550048828125, 'x2': 545.46484375, 'y2': 540.385009765625, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.9551773071289, 'y1': 503.7550048828125, 'x2': 545.46484375, 'y2': 540.385009765625, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.9251937866211, 'y1': 630.125, 'x2': 545.48486328125, 'y2': 692.3150024414062, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.9251937866211, 'y1': 630.125, 'x2': 545.48486328125, 'y2': 692.3150024414062, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 17, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 69.77519226074219, 'y1': 604.9149780273438, 'x2': 545.434814453125, 'y2': 641.5549926757812, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 69.77519226074219, 'y1': 604.9149780273438, 'x2': 545.434814453125, 'y2': 641.5549926757812, 'width': 612.0, 'height': 792.0}], 'page': 17}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 62, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}], 'page': 62}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Non unique id for this annotation: {'page': 62, 'properties': {'color': 'yellow'}, 'position': {'bounding': {'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}, 'rects': [{'x1': 260.1609191894531, 'y1': 70.8699951171875, 'x2': 278.1609191894531, 'y2': 88.8699951171875, 'width': 612.0, 'height': 792.0}], 'page': 62}, 'content': {'text': ''}, 'id #uuid': 'ae006a11-e144-389f-b5c6-641478c1a4be', 'author': 'mp68'}
Traceback (most recent call last):
  File "/Users/mp68/Research/tools/LogseqPDFImporter/LogseqPDFImporter.py", line 388, in <module>
    inst = fire.Fire(main)
           ^^^^^^^^^^^^^^^
  File "/Users/mp68/anaconda3/lib/python3.11/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mp68/anaconda3/lib/python3.11/site-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/Users/mp68/anaconda3/lib/python3.11/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mp68/Research/tools/LogseqPDFImporter/LogseqPDFImporter.py", line 315, in main
    raise Exception("Some annotations uuid were not unique! "
Exception: Some annotations uuid were not unique! The uuid is derived from the text content or the image location.

thiswillbeyourgithub commented 5 months ago

Hi, thanks for the kind words :)

I just pushed a new branch: https://github.com/thiswillbeyourgithub/LogseqPDFImporter/tree/fix_nonunique_uuids

I added an argument to specify what to do for non-unique UUIDs. Please do the following:

  1. Use the value "exit" first, to check whether the code indeed crashes like it used to.
  2. Use the value "remove" to just discard duplicate annotations.

I added a sanity check that verifies that the annotations are indeed full duplicates. Please report back if you see prints indicating that "Annotations with the same UUID are actually different".
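As an illustration of the two values and the sanity check described above, here is a minimal sketch. It is not the code from the branch: the function name `resolve_duplicates`, the parameter name `on_duplicate`, and the choice to keep differing annotations rather than drop them are all assumptions. Only the key `'id #uuid'` and the warning text come from this thread.

```python
from collections import defaultdict

UUID_KEY = "id #uuid"  # key name as it appears in the console output above


def resolve_duplicates(annotations, on_duplicate="exit"):
    """Hypothetical helper: group annotations by UUID, then apply a policy."""
    groups = defaultdict(list)
    for annot in annotations:
        groups[annot[UUID_KEY]].append(annot)

    kept = []
    for uuid_value, group in groups.items():
        if len(group) == 1:
            kept.append(group[0])
            continue
        if on_duplicate == "exit":
            raise ValueError(
                f"Non-unique UUID {uuid_value} shared by {len(group)} annotations"
            )
        if on_duplicate != "remove":
            raise ValueError(f"Unknown policy: {on_duplicate!r}")
        # Sanity check: only discard copies that are identical in every field.
        first = group[0]
        if all(other == first for other in group[1:]):
            kept.append(first)
        else:
            # Assumption: differing annotations are kept so no data is lost.
            print(f"Annotations with the same UUID are actually different: {uuid_value}")
            kept.extend(group)
    return kept
```

With "remove", exact copies collapse to one entry, while annotations that merely share a UUID trigger the warning and are left for the user to inspect.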

Please report back here so that I can merge this with the main branch if this works fine.

mp68 commented 5 months ago

Wow, thank you for your incredibly fast reply! I tested the branch and can report the following:

  1. using the value "exit" results in crashing like it used to
  2. using the value "remove" results in "Annotations with the same UUID are actually different", see below. If it helps, I could email you a sample annotated PDF file that causes the problems.

Log output (too large for copy&paste): https://gist.github.com/mp68/4eba668a63d3f9a5b9c95c53cde93592

thiswillbeyourgithub commented 5 months ago

Thanks for reporting. That showed that I had made a mistake: the UUID was derived from the filename + text, but your annotations all have empty text, so they all ended up with the same UUID.
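The collision can be reproduced in a few lines. This is a sketch under assumptions: the UUIDs in the log look like version 3 (name-based) UUIDs, so `uuid.uuid3` is used here, but the namespace, the exact string being hashed, and the helper names `uuid_from_text` and `uuid_with_position` are hypothetical and not taken from the repository.

```python
import uuid


def uuid_from_text(filename, text):
    # Scheme described above: only filename + text feed the hash, so every
    # annotation with empty text in the same file gets the same UUID.
    return str(uuid.uuid3(uuid.NAMESPACE_DNS, filename + text))


def uuid_with_position(filename, text, page, bounding):
    # One possible fix (an assumption, not necessarily what was committed):
    # also mix the page number and bounding box into the hashed name.
    key = "|".join(
        str(part)
        for part in (
            filename,
            text,
            page,
            bounding["x1"],
            bounding["y1"],
            bounding["x2"],
            bounding["y2"],
        )
    )
    return str(uuid.uuid3(uuid.NAMESPACE_DNS, key))


box_a = {"x1": 70.30, "y1": 276.05, "x2": 545.44, "y2": 337.89}
box_b = {"x1": 69.92, "y1": 326.46, "x2": 545.47, "y2": 363.45}

# Two different highlights on page 5, both with empty text:
print(uuid_from_text("paper.pdf", "") == uuid_from_text("paper.pdf", ""))
print(uuid_with_position("paper.pdf", "", 5, box_a)
      == uuid_with_position("paper.pdf", "", 5, box_b))
```

Note that annotations identical in every field, such as the last two entries on page 62 in the log, would still share a UUID under any content-derived scheme; that is the case the "remove" value is meant for.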

Can you try again please?

I expect that you will indeed have the colored area at the location of the highlight in the PDF, but unfortunately it might also mean that all your annotations will be devoid of text... If that happens, do try playing with the text_boundary_threshold argument and report back!

To help with that, I also added a new print that tells you whether an annotation has empty text.
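The thread does not say how text_boundary_threshold is applied internally. As a rough intuition only, the sketch below assumes it is a minimum overlap fraction: a word counts as highlighted when at least that share of its box lies inside the highlight rectangle. The function names and the `(box, text)` word format are hypothetical.

```python
def overlap_fraction(word_box, highlight_box):
    """Share of the word's area that lies inside the highlight rectangle."""
    wx1, wy1, wx2, wy2 = word_box
    hx1, hy1, hx2, hy2 = highlight_box
    inter_w = max(0.0, min(wx2, hx2) - max(wx1, hx1))
    inter_h = max(0.0, min(wy2, hy2) - max(wy1, hy1))
    word_area = (wx2 - wx1) * (wy2 - wy1)
    return (inter_w * inter_h) / word_area if word_area > 0 else 0.0


def text_in_highlight(words, highlight_box, threshold=0.5):
    """Join the words whose overlap with the highlight reaches the threshold."""
    return " ".join(
        text
        for box, text in words
        if overlap_fraction(box, highlight_box) >= threshold
    )


words = [
    ((0, 0, 10, 10), "inside"),
    ((8, 0, 18, 10), "edge"),
    ((50, 0, 60, 10), "outside"),
]
highlight = (0, 0, 10, 10)
print(text_in_highlight(words, highlight, threshold=0.5))
print(text_in_highlight(words, highlight, threshold=0.1))
```

Under this assumption, lowering the threshold pulls in words that only partly touch the highlight, which is why tuning it could turn an empty annotation into one with text.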

thiswillbeyourgithub commented 5 months ago

(Do check that you have the latest commit: d4ac839bd0cc6b10c18cb0d6e14af0d5232d6a29 !)

mp68 commented 5 months ago

It's working great: the annotations are now correctly recognised! However, the annotations themselves cannot be used as a link or reference, as their text content is empty. This also results in a visually broken annotation page. I believe there is no other way to extract the text again? So I would suggest to replace the empty text with the filename + "reference" + increasing number.

thiswillbeyourgithub commented 5 months ago

Can you provide pictures?

mp68 commented 5 months ago

Sure! 😊

[image] Annotations are well integrated into the PDF.

[image] The annotations page is broken.

thiswillbeyourgithub commented 5 months ago

> I believe there is no other way to extract the text again?

Well, you can delete the annotation page, rerun LogseqPDFImporter, and try modifying the arguments for the overlap.

> So I would suggest to replace the empty text with the filename + "reference" + increasing number.

What do you mean by reference?

I pushed a fix, can you try it and provide pictures please?

thiswillbeyourgithub commented 5 months ago

(Nothing to do with that but given the type of pdf you're reading and being a medical student myself you might be interested in taking a look at my other repos. For example DocToolsLLM and anki related stuff. I have more repos to create in the coming 12 months too that are geared towards education.)

mp68 commented 5 months ago

[image] Works great now! Images are still broken, but I think it's a path problem. This is what gets generated in the assets folder: [image]

> What do you mean by reference?

Just as you did it but in a more "elegant" way 😊 So the filler text of the annotation will be "Bataller-Outcomes and genetic dynamics of acute myeloid leukemia at first relapse-2020-Haematologica Reference 12" instead of "Notext 12"
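A minimal sketch of that naming scheme, assuming the filler is built from the PDF's filename without its extension. The helper names `placeholder_text` and `fill_empty_text` are hypothetical; only the annotation layout (`content` / `text`) is taken from the console output earlier in this thread.

```python
from pathlib import Path


def placeholder_text(pdf_path, index):
    """Build a filler such as '<pdf name> Reference 12'."""
    return f"{Path(pdf_path).stem} Reference {index}"


def fill_empty_text(annotations, pdf_path):
    """Give every annotation with empty text a numbered placeholder."""
    counter = 0
    for annot in annotations:
        if not annot["content"]["text"].strip():
            counter += 1
            annot["content"]["text"] = placeholder_text(pdf_path, counter)
    return annotations


annots = [
    {"page": 5, "content": {"text": ""}},
    {"page": 62, "content": {"text": "[:span]"}},  # image annotation, left as-is
    {"page": 8, "content": {"text": ""}},
]
for annot in fill_empty_text(annots, "papers/Some-Title-2020-Journal.pdf"):
    print(annot["content"]["text"])
```

Numbering only the empty annotations keeps the counter contiguous, so the placeholders read as "Reference 1", "Reference 2", and so on, regardless of how many annotations do have text.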

> (Nothing to do with that but given the type of pdf you're reading and being a medical student myself you might be interested in taking a look at my other repos. For example DocToolsLLM and anki related stuff. I have more repos to create in the coming 12 months too that are geared towards education.)

Very interesting stuff, thank you for sharing these! Will keep an eye on them. Where are you based at?

thiswillbeyourgithub commented 5 months ago

> Where are you based at?

Too privacy-conscious to share that, sorry :)

I'll see to the rest another day

thiswillbeyourgithub commented 5 months ago

I pushed a commit for the new filename, can you check if it's better for you?

Also, can you investigate the path issue for the images a bit? For example, by telling me the full path of the images in the assets folder and the path / id indicated in the .edn file?
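One way to gather that information is a small script that lists, for each UUID mentioned in the .edn file, the asset files whose names contain it. This is a hedged sketch: it makes no assumption about the .edn structure beyond UUIDs appearing as plain text, the function name `images_by_uuid` is hypothetical, and both paths in the usage note are placeholders to be replaced with real ones.

```python
import re
from pathlib import Path

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def images_by_uuid(edn_path, assets_dir):
    """Map each UUID found in the .edn file to the asset files naming it."""
    edn_text = Path(edn_path).read_text(encoding="utf-8")
    uuids = sorted(set(UUID_RE.findall(edn_text)))
    assets_root = Path(assets_dir)
    files = [
        p.relative_to(assets_root).as_posix()
        for p in assets_root.rglob("*")
        if p.is_file()
    ]
    return {u: sorted(f for f in files if u in f) for u in uuids}


def print_report(edn_path, assets_dir):
    for uuid_value, matches in images_by_uuid(edn_path, assets_dir).items():
        print(uuid_value, "->", matches if matches else "no matching file")
```

Calling `print_report("path/to/file.edn", "path/to/assets")` prints one line per UUID. An empty match is expected for text-only highlights; for an image annotation it would point to a naming or folder mismatch worth reporting here.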

thiswillbeyourgithub commented 4 months ago

Up

thiswillbeyourgithub commented 4 months ago

Without an answer from you, I decided to go ahead and merge the two branches. I'm still bothered by the image paths being wrong, so please don't hesitate to re-open this issue when you have time to share some more info so that I can fix it.