Open bhpayne opened 3 years ago
Symbol definitions, if not in the paper itself, might be in cited papers (use bibliographic citation tracing)
xz -d HEP_TEX.model.xz
pip install -r requirements.txt
python resolve_symbol_definitions.py tex_file
The script tries to map each variable name to its definition(s) and/or properties in the file.
It currently maps 10-30% of definitions. For the rest, it creates a Concordance dictionary: a dictionary of lists holding every sentence that uses the variable.
I have found that roughly 50k variables are used in HEP,
many of which do not share the same definition.
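To make the Concordance idea concrete, here is a minimal sketch in Python. It is not the code in resolve_symbol_definitions.py; the function name build_concordance, the single-dollar inline math pattern, and the naive sentence splitter are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumption: inline math is delimited by single dollar signs ($...$).
INLINE_MATH = re.compile(r"\$([^$]+)\$")
# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def build_concordance(text):
    """Map each inline-math symbol to the list of sentences that use it."""
    concordance = defaultdict(list)
    for sentence in SENTENCE_END.split(text):
        sentence = sentence.strip()
        if not sentence:
            continue
        # dict.fromkeys de-duplicates symbols while keeping their order,
        # so a sentence is recorded once per symbol it mentions.
        symbols = dict.fromkeys(
            m.strip() for m in INLINE_MATH.findall(sentence)
        )
        for symbol in symbols:
            concordance[symbol].append(sentence)
    return dict(concordance)


sample = (
    r"The fine structure constant $\alpha$ is dimensionless. "
    r"We relate $\alpha$ to the bound $v_u$."
)
for symbol, sentences in build_concordance(sample).items():
    print(symbol, len(sentences))
```

A dictionary of lists keeps every usage of a symbol, which matters when the same symbol carries different definitions in different papers.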
Is HEP_TEX.model.xz in the git repo?
The file HEP_TEX.model.xz was removed.
I will update the Python files to use the results from scanner.out for word tokenization.
The first pass at resolving symbol definitions can be found in the utils directory.
Run make variable_definitions in the utils directory.
The results look like:
<:, the fine structure constant $\alpha$:>
<:and the proton-to-electron mass ratio $\frac{m_p}{m_e}$:>
<:the upper bound for the speed of sound in condensed phases, $v_u$:>
<:We find that $\frac{v_u}{c}=\alpha\left(\frac{m_e}{2m_p}\right)^{\frac{1}{2}}$:>
...
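The output above can be post-processed into (symbol, context) pairs. The following is a rough sketch that assumes each record is wrapped in `<:` and `:>` exactly as in the sample, and that symbols appear as single-dollar inline math. The function name parse_records is hypothetical and not part of the repository.

```python
import re

# Assumption: each record is delimited by "<:" and ":>" as in the sample.
RECORD = re.compile(r"<:(.*?):>", re.DOTALL)
# Assumption: symbols are written as single-dollar inline math.
INLINE_MATH = re.compile(r"\$([^$]+)\$")


def parse_records(output):
    """Return a list of (symbol, context) pairs, one per math span."""
    pairs = []
    for body in RECORD.findall(output):
        context = body.strip()
        for symbol in INLINE_MATH.findall(context):
            pairs.append((symbol.strip(), context))
    return pairs


sample_output = r"""
<:, the fine structure constant $\alpha$:>
<:and the proton-to-electron mass ratio $\frac{m_p}{m_e}$:>
<:the upper bound for the speed of sound in condensed phases, $v_u$:>
"""

for symbol, context in parse_records(sample_output):
    print(symbol, "|", context)
```

Keeping the full context string next to each symbol leaves room for a later step to decide which words in it form the definition.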
The results from the Python version can be found at https://github.com/allofphysicsgraph/latex-in-arxiv/blob/master/symbol_definitions
Suppose we can pick out math symbols from all the papers.