Open jymchng opened 1 year ago
@J-F-Liu Hi J-F-Liu, just thinking about this `replace_text` method that returns a `Result<()>` - it means there is a contract between the caller and callee such that if `replace_text` does replace the text in the .pdf, it returns `Ok(())`, else it returns an `Err` variant.
For this function, particularly on Line 138, it seems that the function does nothing when the encoding is not within the pre-defined 'able-to-parse' encodings ("Tf" or "Tj"): the match arm `_ => {}` evaluates to an empty scope. Would it be better to return an `Err` so that the caller knows it is not getting what the function promises to do, because it is unable to parse any other type of encoding?
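To make the suggestion concrete, here is a minimal, self-contained sketch of one way the contract could be enforced. It does not use the library's actual types: `Operation`, `ReplaceError`, and `replace_text_checked` are hypothetical names invented for illustration, and operands are plain strings rather than encoded PDF objects. Note that "Tf" and "Tj" are content-stream operators, and a page normally contains many other operators, so returning an `Err` directly from the `_ => {}` arm would probably fail on almost every page. The sketch instead counts replacements and returns an `Err` only when nothing was replaced.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for one content-stream operation.
#[derive(Debug, Clone, PartialEq)]
pub struct Operation {
    pub operator: String,
    pub operands: Vec<String>,
}

/// Hypothetical error type for this sketch; not part of the library.
#[derive(Debug, PartialEq)]
pub enum ReplaceError {
    /// No "Tj" operand matched the requested text.
    TextNotFound,
}

impl fmt::Display for ReplaceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ReplaceError::TextNotFound => write!(f, "no matching text was replaced"),
        }
    }
}

impl std::error::Error for ReplaceError {}

/// Replaces operands of "Tj" operations that are exactly equal to `text`.
/// Returns `Ok(())` only if at least one operand was actually replaced.
pub fn replace_text_checked(
    ops: &mut [Operation],
    text: &str,
    other_text: &str,
) -> Result<(), ReplaceError> {
    let mut replaced = 0usize;
    for op in ops.iter_mut() {
        match op.operator.as_str() {
            "Tj" => {
                for operand in op.operands.iter_mut() {
                    if operand.as_str() == text {
                        *operand = other_text.to_string();
                        replaced += 1;
                    }
                }
            }
            // Unhandled operators are still skipped, but the skip is no
            // longer silent: it shows up in the final check below.
            _ => {}
        }
    }
    if replaced == 0 {
        Err(ReplaceError::TextNotFound)
    } else {
        Ok(())
    }
}
```

With this shape, a caller whose text sits under an operator the function does not handle (for example "TJ") gets `Err(ReplaceError::TextNotFound)` instead of a misleading `Ok(())`.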
Yes, text processing is not implemented completely.
@J-F-Liu Hi Liu, do you think this issue can be fixed?
Theoretically this can be fixed, but sadly, extracting text from a PDF is hard. The solution for #125 may lay a first foundation for solving this issue, as it would (sometimes) allow the text to be extracted from the PDF. However, this is only half of the solution: we would still need to implement putting the replacement text back into the PDF.
Apparently, directly replacing text in a page doesn't work?