jimmejardine / qiqqa-open-source

The open-sourced version of the award-winning Qiqqa research management tool for Windows
GNU General Public License v3.0

Is there a way to replace or update my pdf in qiqqa #317

Open safaabasabreen opened 3 years ago

safaabasabreen commented 3 years ago

Because I cannot use qiqqa-open-source on my phone, I continue reading and taking notes in Adobe Reader. However, I cannot integrate those notes back into my Qiqqa PDF. That makes me wonder whether Qiqqa has a function that lets me replace or update an existing PDF by uploading a new version of it, without having to delete everything related to the old PDF and re-enter all the details, including the reference information.

Thank you

GerHobbelt commented 3 years ago

Unfortunately this function is currently not available in Qiqqa.

Given the technologies underpinning Qiqqa, it would probably be easier to add some sort of "annotations import/export" function to Qiqqa, but I realize that's a crutch. Sorry, not much help ATM.


Elaboration:

Technically, Qiqqa identifies any document by its content hash; when you edit a PDF, the edited file is thus identified as a "new" PDF and added to the database. The problem then is that Qiqqa has no swift way to link these PDF instances: "category/type of duplication" functionality is being looked at but is not available yet (for situations like this one and, for example, when subsequent versions/revisions of a paper are published).
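
To make the content-hash point concrete, here is a minimal Python sketch. It is not Qiqqa's code: SHA-256 stands in for whatever fingerprint Qiqqa actually computes, and `content_fingerprint`, `import_document`, and the dictionary-based `library` are hypothetical names used only for illustration.

```python
import hashlib


def content_fingerprint(data: bytes) -> str:
    """Return a hex digest of the raw file bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def import_document(library: dict, name: str, data: bytes) -> tuple:
    """Add a document keyed by its content hash; report whether it was new."""
    key = content_fingerprint(data)
    is_new = key not in library
    if is_new:
        library[key] = name
    return key, is_new


# Stand-in bytes; a real PDF would be read from disk with open(path, "rb").
original = b"%PDF-1.7 original paper bytes"
annotated = original + b" plus annotation objects added by an external viewer"

library = {}
print(import_document(library, "paper.pdf", original))   # first import
print(import_document(library, "paper.pdf", original))   # identical bytes
print(import_document(library, "paper.pdf", annotated))  # edited bytes
print(len(library))
```

Because the key is derived from the bytes alone, the annotated file lands in the library as a second, unrelated entry; nothing in the key says the two entries are versions of the same paper.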

safaabasabreen commented 3 years ago

Thank you for your fast response. I love qiqqa and it already has great functions so I am sure you will do something about it.

Good luck 👍

Safaa


klmckinney commented 3 years ago

I have been looking at this issue. I mostly read and comment on my .PDFs using Foxit, but I believe the workflow below works for Adobe as well. I've discovered/tried a few things:

1) It is possible to pull highlights from the qiqqa.library file (which is an sqlite3 database) and add them as annotations to a .PDF. This creates a new .PDF file, which you can then read/comment/annotate in your tool of choice.
2) It is possible to pull highlights/annotations from the "new" .PDF and store them in the qiqqa.library file, BUT Qiqqa is currently limited in the annotations it understands and can display.
3) I currently do these operations with Python scripts using the PyMuPDF library, because that was the easiest way for me to get it going.
4) I believe the whole subject of annotations can be a bit confusing in the .PDF world. The Adobe standard defines a bunch of possible annotation types, and not all viewers/editors support all of them. Even "highlights" appear to have more variations than the Qiqqa viewer currently supports.
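
For readers who want to try the PyMuPDF side of this, here is a rough sketch of the two directions described above. It is not the scripts mentioned in this comment: `AnnotationRecord`, `extract_annotations`, `write_highlights`, and `to_json` are hypothetical names, and reading from or writing to qiqqa.library is left out entirely because its record layout is not described in this thread. It assumes PyMuPDF is installed (`pip install pymupdf`).

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AnnotationRecord:
    page: int    # zero-based page index
    kind: str    # annotation type name reported by PyMuPDF, e.g. "Highlight"
    rect: tuple  # (x0, y0, x1, y1) in PDF points
    text: str    # comment text attached to the annotation, if any


def extract_annotations(pdf_path: str) -> list:
    """Read every annotation PyMuPDF can see in the PDF."""
    import fitz  # PyMuPDF (third-party); imported here so the rest runs without it

    records = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            for annot in page.annots():
                records.append(AnnotationRecord(
                    page=page.number,
                    kind=annot.type[1],
                    rect=tuple(annot.rect),
                    text=annot.info.get("content", ""),
                ))
    return records


def write_highlights(src_path: str, dst_path: str, records: list) -> None:
    """Save a copy of src_path with one highlight added per record."""
    import fitz  # PyMuPDF

    with fitz.open(src_path) as doc:
        for rec in records:
            doc[rec.page].add_highlight_annot(fitz.Rect(*rec.rect))
        doc.save(dst_path)  # writes a new file; the source PDF is untouched


def to_json(records: list) -> str:
    """Serialize records to JSON, a neutral format for moving them around."""
    return json.dumps([asdict(r) for r in records], indent=2)
```

A plain rectangle per highlight is a simplification: as point 4 notes, real highlights come in more variations (multi-line highlights, for instance), which this sketch does not try to preserve.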

@GerHobbelt, this is actually a subject that interests me a lot: both "how to make a better performing PDF viewer in Qiqqa" and how to support more .PDF annotation types. I settled on the "work directly with the qiqqa.library" solution as it was the quickest way to get to my goal. (And I think I saw you doing similar "direct to qiqqa.library" work in some earlier Qiqqa posts, which got me thinking this way.) I'd be happy to work some of this into the codebase if I could get some "coaching" on what to put where.

GerHobbelt commented 3 years ago

@klmckinney : wow! 👍 Particularly §2: storing the annotations in the sqlite DB: qiqqa records include a MD5 content hash and my own work re writing DB records (back when qiqqa was still commercial) turned up negative results for a long time as qiqqa discarded my edits due to MD5 hash mismatch. (That was what it did for the (JSON-encoded) BibTeX records anyway.)
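
A small Python illustration of why such edits get discarded, under loud assumptions: the `item` table, its columns, and the helper names are all hypothetical, and how Qiqqa really computes and checks its MD5 is not shown in this thread. The sketch only demonstrates the general pattern of a per-record hash that must be rewritten together with the data.

```python
import hashlib
import json
import sqlite3

# Hypothetical schema, NOT the real qiqqa.library layout.
SCHEMA = "CREATE TABLE item (fingerprint TEXT PRIMARY KEY, md5 TEXT, data BLOB)"


def md5_hex(data: bytes) -> str:
    """Hex MD5 of the stored record bytes."""
    return hashlib.md5(data).hexdigest()


def write_record(conn, fingerprint: str, payload: dict) -> None:
    """Store the JSON payload together with a hash of exactly those bytes."""
    data = json.dumps(payload, sort_keys=True).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO item (fingerprint, md5, data) VALUES (?, ?, ?)",
        (fingerprint, md5_hex(data), data),
    )


def is_consistent(conn, fingerprint: str) -> bool:
    """True if the record exists and its stored hash matches its data."""
    row = conn.execute(
        "SELECT md5, data FROM item WHERE fingerprint = ?", (fingerprint,)
    ).fetchone()
    return row is not None and row[0] == md5_hex(row[1])


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
write_record(conn, "DOC1", {"title": "Example"})
print(is_consistent(conn, "DOC1"))  # hash and data were written together

# A naive external edit: the data changes but the stored hash does not.
conn.execute(
    "UPDATE item SET data = ? WHERE fingerprint = ?",
    (b'{"title": "Edited"}', "DOC1"),
)
print(is_consistent(conn, "DOC1"))  # a reader that verifies the hash would reject this
```

Any tool patching the database from outside would need to update both columns in the same write, which is what `write_record` does here.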

The sqlite work (qiqqa.library I/O) you are referring to was reverse engineering work from back in the commercial days: https://github.com/GerHobbelt/qiqqa-revengin

(I'd have to check the code now to see if the MD5 checks also do apply to the annotation records... 🤔 Regrettably the Qiqqa codebase isn't exactly sticking in my head when I haven't touched it very recently. 😞 )

Anyway, I digress.


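[Edit: cr**. Cut&dumped what I was writing here: https://github.com/jimmejardine/qiqqa-open-source/blob/master/docs-src/Progress%20in%20Development/Considering%20the%20Way%20Forward/Working%20with%20annotations.md]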

TL;DR: coaching: would love to but right now I don't have a quick satisfaction path for that: I see some highly interesting efforts of yours which nicely mesh with the new ideas, but the big "but" there is the fact that these currently still are ideas, which have gone through some feasibility studies and stuff like that, but are not yet present in the qiqqa codebase.

That drives me to think about how we can make this something that gets some tangible results relatively quickly so we don't get bogged down (like I already am in the PDF treatment path 😓 -- no-one but lil' ol' me who can get myself out of that swamp. Needs some anger again to get it done there.)

@klmckinney : what you could do for starters is get your current work into a fresh github repo, maybe add a README and a bunch of notes what it does, so we both have something to look at that is "known working". No matter how fast or slow we are after that, we'll at least have that then, accessible for everyone.

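Next, either I get funky and add a web hook into Qiqqa (the "localhost:9194" thing I'm talking about in here: https://github.com/jimmejardine/qiqqa-open-source/blob/master/docs-src/Progress%20in%20Development/Considering%20the%20Way%20Forward/Working%20with%20annotations.md) or we come up with something else to hook this into Qiqqa.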
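
Purely as an illustration of what talking to such a web hook could look like from the Python side: no such endpoint exists in Qiqqa at the time of this thread. Only the host and port come from the comment above; the `/annotations` path, the JSON field names, and `build_annotation_request` are invented for this sketch.

```python
import json
import urllib.request

# Host and port are from the thread; the path is hypothetical.
QIQQA_HOOK_URL = "http://localhost:9194/annotations"


def build_annotation_request(fingerprint: str, annotations: list) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST carrying annotations for one document."""
    body = json.dumps(
        {"fingerprint": fingerprint, "annotations": annotations}
    ).encode("utf-8")
    return urllib.request.Request(
        QIQQA_HOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_annotation_request(
    "DOC1",
    [{"page": 0, "kind": "Highlight", "rect": [72, 100, 300, 115]}],
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted because nothing listens there yet.
```

Going through an interface like this, rather than patching the database directly, would leave hash bookkeeping and any cache updates to Qiqqa itself.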

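Before we go there and come up with tasks & goals, best to check implicit assumptions:

  • how comfortable are you with C (and C++) code?
  • C#? WPF?
  • Visual Studio in general? git, git submodules? (Qiqqa project has dependencies and is pretty large)
  • web design? (because I'm also thinking: if someone can take the web-tech based UI redo/upgrade/redesign that is to be out of my hands, that would also be really something; a long goal but very useful, for then I don't have to bother with the CSS and everything UI visual fancy all that much 😄 )
  • using "web APIs" and such; I'm old skool so that's all "socket programming" in my lizard brain, but you & I know that doesn't cover it, really. It's protocols, formats, etc. Just a bit more than only basic JSON back & forth is a plus here as there may be binary data flying around. 😄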

Anyway, let me know what areas you're comfortable in and where you get your kicks so we can iron something out that works, also motivation-wise. (Open Source is stamina and drive, for a large part, so best to pick something that you keep going for intrinsically.) If you like that better, we can set up a voice or vid call and see where it gets us. 😄

klmckinney commented 3 years ago

Hi Ger, and others,

I do much of my day-to-day work in C and/or Python. I find Python is 1 (and sometimes nearly 2) orders of magnitude faster for development work. Twenty years ago I did a lot of C++, and I still can, but for most of the daily work it is C and Python. I have not done much C# because I've never had a project that needed it, but I'm happy to do it as necessary.

I like the idea of a webhook or another external interface into Qiqqa. My current "direct manipulation of the database" approach appears to work, but I sometimes see unexpected slowdowns, which leads me to believe I might be missing something behind the scenes.

I can do Visual Studio. I have both Community (of course, as it is free) and Professional.

I am not as fluent in Git as I am in Perforce, but I can do Git/GitHub.

I understand WPF and XAML but never have had a reason to do serious work with it.

Web Design? Actually, I HATE IT. I am just not a visual person. In fact, even in this day and age I spend much of my time at command prompts (well, except for my PDF viewer!).

Web APIs? Like REST APIs? Many of these rock! I am happy to work on this. It is just the "visual web stuff" that I dislike.

In summary:
Currently I am most interested in advancing PDF annotations and "interoperability" with external PDF viewers. I had some thoughts about improving the built-in viewer, but then thought that if there is a way to develop an interface that works directly with Foxit or Acrobat, that would be a better use of time.

My preferred development workflow is to use Python and call into relevant .DLLs as necessary. Note that developing concepts/features in Python doesn't necessarily mean deployment in Python, as they could be recoded in C. But if performance is adequate, I often leave the work in Python. As mentioned above, I can provide code in C/C++ (and C#) as necessary.

Let's chat on a voice/video call sometime soon. I have Teams as part of work and also my Microsoft 365 account. There are also Zoom, Jitsi, and other platforms. Lots of choices.

-- Kenelm McKinney


headshrinker81 commented 3 years ago

Just some feedback from a user. I thought I had found the holy grail: a reference manager with built-in mind map functionality. At first Qiqqa looked so much easier to use than Docear (which seems to be dead anyway), and I started toying around with importing PDF documents from my hard drive and creating a mind map. I am shattered that Qiqqa cannot currently handle annotations made with PDF viewers like Foxit :( From what I am reading above, this does not seem to be an easy task. Does anyone know a workaround for this issue, or an alternative tool that incorporates this functionality?

Thanks and a good weekend to all

GerHobbelt commented 3 years ago

@headshrinker81 - Qiqqa has some "pdf annotation extract" abilities (which I have never checked out, I must confess) in the PDF viewer:

[image: screenshot of the annotation extract option in the Qiqqa PDF viewer]

but that's unfortunately not covering what we need (assuming this works like it says on the tin 😉 -- I haven't yet tested and used all corners in Qiqqa since it became open source): when you import an annotated PDF, the file content includes those annotations, so they are part of the "content hash" which identifies the document. Change, edit or otherwise alter your annotations and the document is thus recognized as "another new one" on re-import due to changed "content hash", so, yes, what I said before does still apply.

@klmckinney did some direct database patching outside of Qiqqa; that approach might be useful for this feature, but we'll have to look into it further.

GerHobbelt commented 3 years ago

@klmckinney : (also in reply to your other comment elsewhere): yes, this could be very useful backend work in qiqqa's new mupdf (under development).

To start that without throwing you into the shark water immediately, let's do this in smaller steps: first get your current work on Qiqqa checked into another repository you own yourself (so you don't have to deal with the large Qiqqa repo, which includes git submodules, etc. so you can get acquainted with git more smoothly - nothing difficult there but slow descent is probably better than deep dive.)

Next would be getting mupdf loaded and built on your dev machine and then having a look at mutool metadump et al, as that'll be the starting point of my new way to obtain all sorts of metadata (including annotations) from all the PDFs out there.

Easiest right now for me would be call via WhatsApp. My mobile number floats around on the internet anyway, so +31 six one one 120 978 is where you can connect with. Might be handy to send me an email at ger@hobbelt.com to share numbers and a schedule when you're available for voice call / chat perhaps? (I am in Amsterdam Timezone, by the way.)