pkubowicz / opendetex

Improved version of Detex - tool for extracting plain text from TeX and LaTeX sources

Does detex allow me to ignore a newline that exists in the middle of a line? #44

Closed avatar-lavventura closed 5 years ago

avatar-lavventura commented 6 years ago

For example, I am writing a LaTeX document.

detex converts one line as:

scientists
for decades ...

It should be: scientists for decades ...

Because the original .tex document has a newline at that point, the output has one too.

[Q] Is there any way to ignore a newline that falls in the middle of a sentence?

pkubowicz commented 6 years ago

No, no such functionality exists. detex was intended primarily to allow spell-checking TeX sources, not to provide a nicely formatted plain text output.

avatar-lavventura commented 6 years ago

I guess my option is to replace each \n with a space ' ': tr '\n' ' ' < outputFile.txt @pkubowicz

pauloney commented 5 years ago

@avatar-lavventura, I am not really sure what you are trying to accomplish here. If you tell us more, we will be able to recommend something more appropriate.

Converting every newline to a space will certainly "damage" most DeTeX output, because it turns all of it into one very long line. DeTeX has no knowledge of where a chapter or section starts, so headings will end up lost in the middle of that line.
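A gentler variant of the newline replacement, as a minimal sketch only: join the lines inside each paragraph but keep blank lines as paragraph breaks. This is not a detex feature, and `unwrap_paragraphs` is a hypothetical helper name. It assumes the detex output separates paragraphs (and headings) with blank lines, which may not hold for every document.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join wrapped lines within each paragraph; keep paragraph breaks.

    Assumption: paragraphs are separated by one or more blank
    (empty or whitespace-only) lines in the input.
    """
    # Split the text wherever a blank line (or a run of them) occurs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, collapse every run of whitespace,
    # including single newlines, into one space.
    joined = (" ".join(p.split()) for p in paragraphs)
    # Re-insert exactly one blank line between paragraphs.
    return "\n\n".join(joined)
```

With this approach the example from the question becomes `scientists for decades ...` on one line, while a heading that sits in its own blank-line-delimited block stays on its own line instead of being merged into the surrounding text.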

If you are looking for something that will produce a nice ASCII format for your text, look into processing the document into a PDF and then using pdftotext to obtain a nice ASCII output.