facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

pdf to .tex and not .mmd #195

Open pkozlows opened 8 months ago

pkozlows commented 8 months ago

I converted a PDF to a .mmd file using nougat. I am not very familiar with the .mmd syntax, but the result looks promising. However, when I converted this .mmd to .tex with pandoc, the output failed to compile, I think because it requires some dependencies that I don't have. Here is an example:

.mmd

(14 pts.) Consider a region within a fluid described by the van der Waals equation \[\beta p=\frac{\rho}{1-b\rho}-\beta a\rho^{2}, \tag{1}\] where \(\rho=\langle N\rangle/V\). The volume of the region is \(L^{3}\). Due to the spontaneous fluctuations in the system, the instantaneous value of the density in that region can differ from its average by an amount \(\delta\rho\). Determine, as a function of \(\beta\), \(\rho\), \(a\), \(b\), and \(L^{3}\), the typical relative size of these fluctuations; that is, evaluate \(\langle(\delta\rho)^{2}\rangle^{1/2}/\rho\). Demonstrate that when one considers observations of a macroscopic system (i.e., the size of the region becomes macroscopic, \(L^{3}\rightarrow\infty\)) the relative fluctuations become negligible.

.tex

(14 pts.) Consider a region within a fluid described by the van der Waals equation {[}\beta p=\frac{\rho}{1-b\rho}-\beta a\rho\^{}{2}, \tag{1}{]} where (\rho=\langle N\rangle/V). The volume of the region is (L\^{}{3}). Due to the spontaneous fluctuations in the system, the instantaneous value of the density in that region can differ from its

In addition, it produces a LaTeX preamble that looks overly complicated:

% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
%
\documentclass[
]{article}
\usepackage{amsmath,amssymb}
\usepackage{iftex}
\ifPDFTeX
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
  \usepackage{unicode-math} % this also loads fontspec
  \defaultfontfeatures{Scale=MatchLowercase}
  \defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
\usepackage{lmodern}
\ifPDFTeX\else
  % xetex/luatex font selection
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
  \usepackage[]{microtype}
  \UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class

Is there a better way to do all of this? I am interested in converting a PDF containing the instructions for a problem set into a .tex file that I can later manipulate.

pkozlows commented 8 months ago

The multimarkdown CLI tool also gives something strange, but cleaner:

(14 pts.) Consider a region within a fluid described by the van der Waals equation [\textbackslash{}beta p=\textbackslash{}frac{\textbackslash{}rho}{1-b\textbackslash{}rho}-\textbackslash{}beta a\textbackslash{}rho\^{}{2}, \textbackslash{}tag{1}] where (\textbackslash{}rho=\textbackslash{}langle N\textbackslash{}rangle\slash{}V). The volume of the region is (L\^{}{3}). Due to the spontaneous fluctuations in the system, the instantaneous value of the density in that region can differ from its average by an amount (\textbackslash{}delta\textbackslash{}rho). Determine, as a function of (\textbackslash{}beta), (\textbackslash{}rho), (a), (b), and (L\^{}{3}), the typical relative size of these fluctuations; that is, evaluate (\textbackslash{}langle(\textbackslash{}delta\textbackslash{}rho)\textsuperscript{{2}\textbackslash{}rangle}{1\slash{}2}\slash{}\textbackslash{}rho). Demonstrate that when one considers observations of a macroscopic system (\emph{i.e.}, the size of the region becomes macroscopic, (L\^{}{3}\textbackslash{}rightarrow\textbackslash{}infty)) the relative fluctuations become negligible.

pkozlows commented 8 months ago

For reference, here is the PDF that I am trying to convert into .tex: ChChE_164_2024_HW3.pdf

hbghlyj commented 2 months ago

The markdown package (ctan.org/pkg/markdown) might help.
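
Another route that stays with pandoc, offered as a sketch rather than a confirmed fix: nougat's .mmd marks math with `\(...\)` and `\[...\]`, and pandoc's default Markdown reader treats `\(` and `\[` as escaped punctuation, which would explain the `{[}` and `\^{}` in the output above. Pandoc has a `tex_math_single_backslash` extension that reads those delimiters as math instead. Leaving out `--standalone` makes pandoc emit only the body, so the long preamble goes away and a short one can be added by hand. The helper names, the minimal preamble, and the file names below are assumptions for illustration, not part of nougat or pandoc.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumption: a deliberately small preamble; extend it as the document needs.
# \tightlist is defined because pandoc emits it for compact lists.
MINIMAL_PREAMBLE = "\n".join([
    r"\documentclass{article}",
    r"\usepackage{amsmath,amssymb}",
    r"\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}",
])


def pandoc_command(mmd_path: str) -> List[str]:
    """Build a pandoc call that reads \\( \\) and \\[ \\] as TeX math.

    No --standalone flag, so pandoc writes a LaTeX fragment to stdout.
    """
    return [
        "pandoc",
        "--from", "markdown+tex_math_single_backslash",
        "--to", "latex",
        mmd_path,
    ]


def wrap_fragment(body: str) -> str:
    """Wrap a LaTeX body fragment in the minimal preamble above."""
    parts = [
        MINIMAL_PREAMBLE,
        r"\begin{document}",
        body.strip(),
        r"\end{document}",
    ]
    return "\n".join(parts) + "\n"


def convert(mmd_path: str, tex_path: str) -> None:
    """Run pandoc (must be installed and on PATH) and write the wrapped result."""
    result = subprocess.run(
        pandoc_command(mmd_path),
        check=True,
        capture_output=True,
        text=True,
    )
    Path(tex_path).write_text(wrap_fragment(result.stdout), encoding="utf-8")
```

Calling `convert("problem_set.mmd", "problem_set.tex")` (hypothetical file names) would then write a compilable file, provided the body uses nothing beyond what the minimal preamble loads. If the .mmd contains tables or figures, the preamble will probably need more packages.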