DFTi / Scribbeo

Scribbeo on iOS (Available on App Store)

Fix text formatting in PDF files #17

Closed zedsaeed closed 12 years ago

zedsaeed commented 12 years ago

Text is not formatting correctly in the PDF files. Please fix because shows may be writing lots of notes.

mixflame commented 12 years ago

Addendum: always show all notes never attach a PDF style

zedsaeed commented 12 years ago

Jon: On the surface it may be easier to ignore the old pdf code and simply have the PDF generated on the server. But there's a major issue with this approach: keep in mind that Scribbeo also has to work in "local" mode. This is where we add clips into Scribbeo via iTunes and mark them. (That is, in local mode, there is no server.) So, it may be worth doing it the long and hard way of decoding the old PDF-making code and fixing it. Let me know your thoughts.

mixflame commented 12 years ago

I tried this approach first. It's bad because Core Text itself can wrap text correctly, functionality that Steve Kochan tried to reinvent. HTML2PDF conversion is definitely the best option: it respects document flow and gives us all the functionality we want. Currently there are two ways to do this, offline and online. The offline approach is to integrate OCPDFGen, which I have been trying to do, so far unsuccessfully (but it's still possible). The online method would be to just have the server generate the PDF for us. There is also a third option, also offline: generate the PDF from HTML using built-in WebView techniques.

I agree that offline is the best option, but repairing Steve Kochan's methods is not very wise, simply because they are incorrectly built in the first place. HTML2PDF is a trusted solution in this regard that will fix all formatting issues. It may be a safe assumption that if users are emailing a PDF, they are online, but my favored approach would still be to allow it all to happen offline, to boost product value. Kochan's code here is too screwy; a fix is near impossible. He just went ahead and did it completely wrong. This isn't the usual case, but changing the PDF system is the best option here.

mixflame commented 12 years ago

Before I go too far here though, I would like to test some of our generated HTML in a standalone PDF generation solution for iOS. Then we can be sure that the library implemented will suit our needs. I'll keep you posted.

zedsaeed commented 12 years ago

J: I totally get what you are saying. Whichever way you go, just remember this: there are a whole lot of people using Scribbeo in offline mode. They like being able to load clips onto their iPad and make notes on them. Then they AirPrint the notes, or if they are near Wi-Fi, they send off a PDF. So, bear in mind that there are a lot of situations where Scribbeo users are not using our server. In fact, I would say that is the majority of cases. However we generate a PDF, it has to be done within our app and iOS on the device.

zedsaeed commented 12 years ago

Jon: Another note on this issue, for clarification. When Ramy, you and I were meeting, we came to an understanding: let's just gray out and mute the microphone button when working in local mode. That way the feature of making audio notes is _only_ available when working with a server. The advantage is that we never have to embed audio into the PDF, which would be needed if we allowed audio notes in local mode. (The PDF would become too big if we allowed audio embedding.) And if people make audio notes in network mode, all we have to do is embed a link in the PDF to the audio file that is sitting on a server somewhere.

mixflame commented 12 years ago

in progress

mixflame commented 12 years ago

Zed, another comment. You are indeed correct. Kochan's code may be bad, but it sorta does work. I was finally able to get the CocoaPods dependency manager for Xcode working, then pull in OCPDFGen from its GitHub repo and test the code within Scribbeo. However, the generated document does not contain the Base64-encoded images as I would expect. We use this encoding because it lets us place the image directly in the page, but this simplistic HTML2PDF library doesn't support it. I might be able to find some code that allows placing the images even with this encoding, which would solve both images and document flow. The other option is to use CocoaPods, which we now know works 100%, to pull in any other code off of GitHub that can solve this. It may be more worthwhile to just fix his code, or to find a library which does all of this and use that.
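For readers unfamiliar with the technique, "Base64-encoded images" here means inlining the image bytes into the HTML as a `data:` URI, so the page needs no separate image file. The following is a minimal sketch in Python rather than the app's Objective-C; the function names `png_data_uri` and `img_tag` are hypothetical and do not come from Scribbeo or OCPDFGen.

```python
import base64
import html


def png_data_uri(png_bytes: bytes) -> str:
    """Return a data: URI that carries the PNG bytes inline as base64."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "data:image/png;base64," + encoded


def img_tag(png_bytes: bytes, alt: str = "thumbnail") -> str:
    """Build an <img> element whose src holds the image data itself."""
    return '<img src="{}" alt="{}">'.format(png_data_uri(png_bytes), html.escape(alt))


# The 8-byte PNG signature stands in for real thumbnail data in this sketch.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
print(img_tag(PNG_SIGNATURE))
```

An HTML-to-PDF converter has to decode such a URI itself; a library that only resolves file or network URLs will silently drop the image, which would be consistent with the symptom described above.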

kfatehi commented 12 years ago

Sounds like progress. If you get images to work in that lib, make sure you respect the license if it requires you to contrib back.

mixflame commented 12 years ago

Tonight I tried for hours to get the inline images working in the converted HTML. Something's up. I will have to ask the authors of DTCoreText, because it is supposed to support the base64 images we use. Anyway, something worth noting is that when I changed the thumbnail storage type from JPEG to PNG, the quality of the images seemed to go up. We just need a little bit more and we will have a flowing PDF generator whose output looks the same as the HTML. It needs a second look. Stopping for tonight. May try an alternative approach tomorrow.

kfatehi commented 12 years ago

Cool keep up the good work... try to enjoy your weekend too :P

mixflame commented 12 years ago

update: didn't get anywhere on OCPDFGen images

however, lots of success with CocoaPods and learning how to #import whatever you want off of GitHub without that much hassle.. kinda scary how easy it really is with Eloy's new tool

started making a newer version of OCPDFGen by updating DTCoreText, but even after fixing that, didn't get base64 images. may wanna email the author and ask, since DTCoreText is actively being worked on. but for now, it's a better idea to switch back to Kochan's code and fix it.

zedsaeed commented 12 years ago

Thanks Jon.

Does this include the formatting fix? Or are you only focusing on the image quality in pdf issue?

Zed

In reply to Jon Silverman's comment of Mon, 20 Aug 2012 14:06:41 -0400.

mixflame commented 12 years ago

if it worked, it would fix both formatting and image quality. however, through this, i found a separate fix for image quality: changing from JPEG to PNG. the reason it would fix formatting is that HTML2PDF conversion respects document flow, whereas Kochan's technique just tries to draw smartly on the page and breaks in the way we've seen. however, when i switched to the OCPDFGen library, which uses DTCoreText, which is supposed to support the base64 images in HTML that we use.. the images don't display. one option is for me to ask the creator of DTCoreText, an actively maintained project, for help. the other option is just to fix Kochan's code and backport the PNG change, which seems to up quality some. the way i checked was just looking at the HTML output.

mixflame commented 12 years ago

also, besides this i am going to try another, better approach. detailed here: http://www.ioslearner.com/tag/iphone-convert-html-to-pdf/

this approach would make the PDF document match 100%, but it would just be an image: no text, no links and no audio notation. if we used this method, we could get image quality real high and document flow (text formatting) to work 100%

zedsaeed commented 12 years ago

Jon:

FYI: it is a huge drawback to have a pdf just as an image. The ability to be able to copy/paste text from pdfs is very useful. (Not to mention the links!)

Z

In reply to Jon Silverman's comment of Mon, 20 Aug 2012 15:01:41 -0400.

mixflame commented 12 years ago

yes, agreed. I won't try that approach. Right now I am looking into C++ PDF libraries that can do it. only other thing is just API-ize it and make it online-only. http://joliprint.com/ is a choice for the conversion. if users are in the middle of Kansas and need a document type, surely HTML is enough until they can get to Starbucks? otherwise gonna have to fix Kochan's method somehow.

zedsaeed commented 12 years ago

Yeah…it may be the path of least resistance (but more headaches) to fix Kochan's code.

Z

P.S I came up with a new term "Kochanic Code." I think you know what that means.

In reply to Jon Silverman's comment of Mon, 20 Aug 2012 15:10:37 -0400.

mixflame commented 12 years ago

need a break from this. the selector's in saveToPDF. can I get a second set of eyes? it may be better for me to move on from here since I have a mental block on this one. Keyvan?

zedsaeed commented 12 years ago

Jon: My suggestion would be to ignore this problem for 24 hrs and work on some of the critical issues I created for Scribbeo. But yes, feel free to get a second opinion.

kfatehi commented 12 years ago

You know what I need in order for me to be able to help you. In case logic fails you here are some options:

Take a break

mixflame commented 12 years ago

10-4

mixflame commented 12 years ago

two problems here

  1. the PDF code draws an exact bounding box around the note, which does not "flow" or stretch for long notes.
  2. the word wrapping code is correct, but will still write across this boundary
,he,gjc,jug,jug,jtcmhfc gjc gjc Greg
Greg handing gjc fdutmdrydmhmtngrnsngrsyrsmhd,gj fdutmdrydmhmtngrnsngrsyrsmhd,gjc
had jtcmhfc jtcmhfc,jtcmhfc jtcmhfc
jtcmhfc ,dft xgjcfm.vncv.gxfdznyrznteabteanyr,guy.uog xgjcfm.vncv.gxfdznyrznteabteanyr,guy.uog,yc.iyg,tj.lu gvfhdthxthtdhdththxhxrhdhdhdthrshrsgrzfezfesfesgrxg gvfhdthxthtdhdththxhxrhdhdhdthrshrsgrzfezfesfesgrxg

The only reason this comment doesn't work is its very long words. That doesn't usually happen in English, making this kind of a non-issue. Formatting can only truly be fixed by a PDF solution that respects document flow, which the current one does not. However, this can be completely resolved with a maximum note size limit.
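The long-word overflow can be illustrated outside the app. The sketch below uses Python's standard `textwrap` module, which counts characters rather than measuring glyph widths the way Core Text does, so it is only an analogy for the real drawing code. The helper name `wrap_note` and the 20-character box width are assumptions made for the example.

```python
import textwrap
from typing import List


def wrap_note(text: str, width: int, break_long_words: bool) -> List[str]:
    """Wrap a note to `width` characters, optionally splitting overlong words."""
    return textwrap.wrap(text, width=width, break_long_words=break_long_words)


# One token from the sample note above, far wider than the assumed box.
note = "Greg handing gvfhdthxthtdhdththxhxrhdhdhdthrshrsgrzfezfesfesgrxg off"

# Without splitting, the long token sits on its own line and crosses the boundary.
overflowing = wrap_note(note, 20, break_long_words=False)

# With splitting, every line stays inside the box.
contained = wrap_note(note, 20, break_long_words=True)

for line in contained:
    print(line)
```

Splitting long tokens keeps text inside the horizontal boundary, but it does not address the fixed box height: a long enough note still needs more lines than the box provides.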

Only other solution really is to completely revamp the PDF code like I tried before, but with a library that actually works.

Since this may be hard or impossible to embed on iOS, we should probably make the PDF generation a web service, and just provide HTML in Offline mode.

Moving on from this, been on it for weeks. I will let someone else make the decision here, but obviously I get that Maximum Note Sizes aren't what we want.

mixflame commented 12 years ago

another problem.

tried changing the lines below the text to expand, but the way it's designed is to fit 4 fixed-height notes on a page

the code can't really be changed to flow by any means. it will need to be moved to serverside and this ticket will have to be closed in a different way. this is not going to be made better the way it is.

the only viable solutions are serverside HTML2PDF rendering, or maximum note size with this generator. internal HTML2PDF and modding this code don't work. i wouldn't leave this generator the way it is, but it cannot be modified to make the notes border flow the way it is designed.
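For comparison, this is roughly what a "flowing" layout means, as opposed to four fixed-height slots per page: each note takes the height it needs, and a new page starts when the next note no longer fits. This is a hedged sketch in Python, not the Scribbeo generator; `paginate`, the height units and the page height are all assumptions for illustration.

```python
from typing import List


def paginate(note_heights: List[int], page_height: int) -> List[List[int]]:
    """Greedily stack notes top to bottom; return the note indices on each page."""
    pages: List[List[int]] = []
    used = 0
    for index, height in enumerate(note_heights):
        if height > page_height:
            # Splitting a single note across pages is outside this sketch.
            raise ValueError("note %d is taller than a page" % index)
        if not pages or used + height > page_height:
            pages.append([])  # start a new page
            used = 0
        pages[-1].append(index)
        used += height
    return pages


# Five notes of varying height on pages that are 100 units tall.
print(paginate([30, 50, 40, 10, 90], page_height=100))
```

The number of notes per page then varies with their length, which is exactly what a layout built around four fixed slots cannot express without restructuring.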

kfatehi commented 12 years ago

if lines_to_write > lines_that_fit_in_text_box {
  pdf_being_created.insert_after_current_page(single_page_version)
  return "This note is oversized and has been printed on a later page."
}

Get it?
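The pseudocode above can be turned into a small runnable sketch. It is written in Python for illustration only; the `Page` type, the `layout` function and the capacity of 6 lines per box are assumptions, while the 4-notes-per-page grid and the placeholder message come from the thread.

```python
from dataclasses import dataclass, field
from typing import List

LINES_PER_BOX = 6    # assumed capacity of one fixed-height note box
NOTES_PER_PAGE = 4   # the existing layout fits 4 notes on a page
PLACEHOLDER = "This note is oversized and has been printed on a later page."


@dataclass
class Page:
    kind: str                                   # "grid" or "overflow"
    boxes: List[List[str]] = field(default_factory=list)


def layout(notes: List[List[str]]) -> List[Page]:
    """Lay out notes (each a list of wrapped lines) on fixed-grid pages.

    An oversized note leaves a placeholder in its box, and its full text
    goes on a dedicated page inserted right after the current grid page.
    """
    pages: List[Page] = []
    for start in range(0, len(notes), NOTES_PER_PAGE):
        grid = Page("grid")
        overflow: List[Page] = []
        for lines in notes[start:start + NOTES_PER_PAGE]:
            if len(lines) > LINES_PER_BOX:
                grid.boxes.append([PLACEHOLDER])
                overflow.append(Page("overflow", [lines]))
            else:
                grid.boxes.append(lines)
        pages.append(grid)
        pages.extend(overflow)
    return pages


notes = [["a"], ["x"] * 10, ["b"], ["c"], ["d"]]
for page in layout(notes):
    print(page.kind, page.boxes)
```

This keeps the fixed grid untouched and only adds pages, which is why it avoids rewriting the drawing code. A note longer than a full page would still need splitting, which the sketch does not attempt.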

kfatehi commented 12 years ago

"the only viable solutions are serverside HTML2PDF": we already decided that this is not viable. Reopened the issue. Just unassign it from yourself if you are giving up. You can reassign it later, but this is not solved.

kfatehi commented 12 years ago

the only viable solutions are serverside HTML2PDF rendering or maximum note size with this generator. internal HTML2PDF and modding this code don't work [yeah this a hard problem, it requires original code]. i wouldn't leave this generator the way it is, but it cannot be modified to make the notes border flow the way it is designed. [i have to disagree, see pseudocode above]

By not fixing this we're basically saying "You cannot have more than this to say about this frame that you drew on", essentially limiting possible creative use of the product. Why not replace the generator? It sounds like you talked yourself out of rewriting it properly. I know you can do it!

mixflame commented 12 years ago

I believe I can too, just sick of this particular problem. Gonna hit other parts of the codebase to add more value. I agree with you that we need a clean in-house solution. Willing to jump back to this after a couple of other tickets. My favored approach would be to embed some C/C++/ObjC library that can do it better than OCPDFGen, and just pull it in with CocoaPods and use it. I really think this approach would give the best results.

zedsaeed commented 12 years ago

Jon,

Definitely go off this PDF issue for a day or so.

Pick up another critical or bug ticket I created for Scribbeo.

Zed

On Aug 22, 2012, at 9:23 AM, "Jon Silverman" wrote:

don't wanna work on this specific one anymore for now, you can have a look if you'd like, but i'm exhausted from it. moving on. if you try this you can ask me anything you want, you'll see what i'm saying about the brittleness of this part of the codebase


kfatehi commented 12 years ago

This is clearly not going to get done until its code is reached in the upcoming major refactor...

Closing this ticket to make a fresh one -- terrible signal to noise ratio here