Reserved characters in bibtex

PaleNeutron commented 2 years ago

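I've noticed that quite a few people put percent signs in the abstract field or write free-form text in the note field, and other fields sometimes contain URLs with query strings.

All of these cause the reserved characters [_%$&] to end up in thesis.bbl, which in turn causes compilation errors. Could the latexmk script preprocess these characters?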

skyzh commented 2 years ago

Normally you should escape these yourself, I think... I'm curious, though: can latexmk actually handle this?

PaleNeutron commented 2 years ago

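For now I escape them with my own script, but it took me a long time to track the problem down, because errors in the bbl file are only reported at the \printbibliography line in main.tex.

I also don't know why the bbl file contains the abstract and note fields at all...

I see the following in the bcf file: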

      <bcf:map map_overwrite="1" map_foreach="title,booktitle,journaltitle,journal,publisher,address,location,institution,organization">
        <bcf:map_step map_field_source="$MAPLOOP" map_match="([^\\])\#" map_replace="$1\\\#"/>
      </bcf:map>
      <bcf:map map_overwrite="1" map_foreach="title,booktitle,journaltitle,journal,publisher,address,location,institution,organization">
        <bcf:map_step map_field_source="$MAPLOOP" map_match="([^\\])\%" map_replace="$1\\\%"/>
      </bcf:map>
      <bcf:map map_overwrite="1" map_foreach="title,booktitle,journaltitle,journal,publisher,address,location,institution,organization">
        <bcf:map_step map_field_source="$MAPLOOP" map_match="([^\\])\x26" map_replace="$1\\\x26"/>
      </bcf:map>

So there seems to be a regex replacement feature; I just don't know how to configure it...
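
The maps in the bcf excerpt are generated from source-map declarations, and the same kind of regex step can be declared in the preamble with \DeclareSourcemap. A minimal sketch, assuming the fields to fix are abstract, note and series (illustrative choices that are not in the field list above), and assuming the hex escape \x25 (%) behaves in Biber's match/replace the same way \x26 (&) does in the excerpt:

```latex
% Sketch only: escape unescaped % and & in fields the style does not cover.
% The field names below are assumptions chosen for illustration.
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    % One \map per field, mirroring the one-map-per-step layout of the bcf excerpt.
    \map[overwrite]{
      % \x25 is the percent sign; the hex form keeps a raw % out of the .tex source
      \step[fieldsource=abstract, match=\regexp{([^\\])\x25}, replace=\regexp{$1\\\x25}]
    }
    \map[overwrite]{
      \step[fieldsource=note, match=\regexp{([^\\])\x25}, replace=\regexp{$1\\\x25}]
    }
    \map[overwrite]{
      % \x26 is the ampersand, the same pattern as in the bcf excerpt
      \step[fieldsource=series, match=\regexp{([^\\])\x26}, replace=\regexp{$1\\\x26}]
    }
  }
}
```

Because the pattern ([^\\]) needs one character before the reserved character, a % or & at the very start of a field is not matched by this sketch.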

hushidong commented 2 years ago

Just don't use abstract.

Add this command to the preamble: \AtEveryBibitem{\clearfield{abstract}}

hushidong commented 2 years ago

Or:

\DeclareStyleSourcemap{
    \maps[datatype=bibtex]{
        \map{
        \step[fieldset=abstract, null]
        }
    }
}

PaleNeutron commented 2 years ago

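I found that although abstract and note can be removed this way, for some reason the content of note still shows up in the \keyw entry of the bbl file. For example: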

\keyw{knowledge graphs; multi-event forecasting; word graphs,Conference Proceedings ER - http://www.syndetics.com/index.aspx?isbn=9781450379984/sc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/mc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/lc.gif&client=summontrial&freeimage=true;}

My setup.tex:

% Load the bibliography databases
% Remove abstract and note
\AtEveryBibitem{\clearfield{abstract}}
\AtEveryBibitem{\clearfield{note}}
\DeclareStyleSourcemap{
    \maps[datatype=bibtex]{
        \map{
        \step[fieldset=abstract, null]
        \step[fieldset=note, null]
        }
    }
}

\addbibresource{bibdata/thesis.bib}
\addbibresource{bibdata/misc.bib}

hushidong commented 2 years ago

That shouldn't happen. Could you post the bib entry of the reference that causes the problem?

zepinglee commented 2 years ago

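LaTeX reserved characters should generally be escaped when the bib file is exported. Also, a % in the url field is generally not escaped, because that field is used inside the \url command.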
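
To illustrate this convention, here is a small self-contained sketch. The entry, its key and the example.com URL are hypothetical, and it assumes a LaTeX release that provides the filecontents environment in the kernel (on older releases, load the filecontents package first).

```latex
\documentclass{article}
\usepackage[backend=biber]{biblatex}

% Hypothetical entry: reserved characters are escaped in ordinary text fields,
% while the url field keeps its raw & and % characters.
\begin{filecontents}{\jobname.bib}
@online{example-escaped,
  author = {Doe, Jane},
  title  = {Savings of 50\% in R\&D with snake\_case Names},
  year   = {2020},
  url    = {http://www.example.com/index.aspx?isbn=123&client=test%20name},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\nocite{*}
\printbibliography
\end{document}
```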

PaleNeutron commented 2 years ago

It was exported directly from NoteExpress... Sigh, these tools are really unreliable; neither it nor EndNote escapes anything on export.

@inproceedings{
DengRangwala-590,
   Author = {Deng, Songgaojun and Rangwala, Huzefa and Ning, Yue},
   Title = {Dynamic Knowledge Graph based Multi-Event Forecasting},
   BookTitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining},
   Series= {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining},
   Publisher = {ACM},
   Pages = {1585-1595},
   Note  = {Conference Proceedings
ER  -
http://www.syndetics.com/index.aspx?isbn=9781450379984/sc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/mc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/lc.gif&client=summontrial&freeimage=true;},
   Abstract = {Modeling concurrent events of multiple types and their involved actors from open-source social sensors is an important task for many domains such as health care, disaster relief, and financial analysis. Forecasting events in the future can help human analysts better understand global social dynamics and make quick and accurate decisions. Anticipating participants or actors who may be involved in these activities can also help stakeholders to better respond to unexpected events. However, achieving these goals is challenging due to several factors: (i) it is hard to filter relevant information from large-scale input, (ii) the input data is usually high dimensional, unstructured, and Non-IID (Non-independent and identically distributed) and (iii) associated text features are dynamic and vary over time. Recently, graph neural networks have demonstrated strengths in learning complex and relational data. In this paper, we study a temporal graph learning method with heterogeneous data fusion for predicting concurrent events of multiple types and inferring multiple candidate actors simultaneously. In order to capture temporal information from historical data, we propose Glean, a graph learning framework based on event knowledge graphs to incorporate both relational and word contexts. We present a context-aware embedding fusion module to enrich hidden features for event actors. We conducted extensive experiments on multiple real-world datasets and show that the proposed method is competitive against various state-of-the-art methods for social event prediction and also provides much-need interpretation capabilities.},
   Keywords = {knowledge graphs; multi-event forecasting; word graphs},
   Year = {2020} }

zepinglee commented 2 years ago

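> It was exported directly from NoteExpress... Sigh, these tools are really unreliable; neither it nor EndNote escapes anything on export.

Consider Zotero. Its export function is implemented in JavaScript, so you can even modify it yourself.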

PaleNeutron commented 2 years ago

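> > It was exported directly from NoteExpress... Sigh, these tools are really unreliable; neither it nor EndNote escapes anything on export.

> Consider Zotero. Its export function is implemented in JavaScript, so you can even modify it yourself.

I don't use Zotero because it can't automatically update the metadata of Chinese-language references, and the bibtex exported from CNKI/Baidu/Google doesn't include a DOI, which is very painful.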

hushidong commented 2 years ago

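Found the problem. It is caused by code I wrote a long time ago, which copied this information so that filters could test against it.

That should no longer be needed, but I never removed it. If you hadn't spotted it, it would probably have stayed there forever. I'll fix it.

However, I updated CTAN only yesterday and it isn't convenient to update that frequently, so for now the update is only on GitHub.

That said, it doesn't actually affect usage. In the entry you posted, only the & in series causes an error; the & in booktitle is already handled automatically by the style file. Add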

\AtEveryBibitem{\clearfield{abstract,note}}

and there will be no error. For example:

\documentclass[a4paper,zihao=-4, linespread=1.67]{ctexart}

\usepackage[backend=biber,style=gb7714-2015]{biblatex}%,erjpunctcn=false,erjcitepunctcn=false

\AtEveryBibitem{\clearfield{abstract,note}}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@inproceedings{
DengRangwala-590,
   Author = {Deng, Songgaojun and Rangwala, Huzefa and Ning, Yue},
   Title = {Dynamic Knowledge Graph based Multi-Event Forecasting},
   BookTitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining},
   Series= {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
   Publisher = {ACM},
   Pages = {1585-1595},
   Note  = {Conference Proceedings
ER  -
http://www.syndetics.com/index.aspx?isbn=9781450379984/sc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/mc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/lc.gif&client=summontrial&freeimage=true;},
   Abstract = {Modeling concurrent events of multiple types and their involved actors from open-source social sensors is an important task for many domains such as health care, disaster relief, and financial analysis. Forecasting events in the future can help human analysts better understand global social dynamics and make quick and accurate decisions. Anticipating participants or actors who may be involved in these activities can also help stakeholders to better respond to unexpected events. However, achieving these goals is challenging due to several factors: (i) it is hard to filter relevant information from large-scale input, (ii) the input data is usually high dimensional, unstructured, and Non-IID (Non-independent and identically distributed) and (iii) associated text features are dynamic and vary over time. Recently, graph neural networks have demonstrated strengths in learning complex and relational data. In this paper, we study a temporal graph learning method with heterogeneous data fusion for predicting concurrent events of multiple types and inferring multiple candidate actors simultaneously. In order to capture temporal information from historical data, we propose Glean, a graph learning framework based on event knowledge graphs to incorporate both relational and word contexts. We present a context-aware embedding fusion module to enrich hidden features for event actors. We conducted extensive experiments on multiple real-world datasets and show that the proposed method is competitive against various state-of-the-art methods for social event prediction and also provides much-need interpretation capabilities.},
   Keywords = {knowledge graphs; multi-event forecasting; word graphs},
   Year = {2020} }
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\nocite{*}

\printbibliography
\end{document}

The result is: [screenshot of the compiled bibliography]

hushidong commented 2 years ago

Or use

\DeclareSourcemap{
    \maps[datatype=bibtex]{
        \map{
        \step[fieldset=abstract, null]
        \step[fieldset=note, null]
        }
    }
}

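which removes the problem entirely. Note that this is \DeclareSourcemap, not the \DeclareStyleSourcemap used earlier. This command is applied at an earlier stage, so it can remove the fields in question.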

\documentclass[a4paper,zihao=-4, linespread=1.67]{ctexart}

\usepackage[backend=biber,style=gb7714-2015]{biblatex}%,erjpunctcn=false,erjcitepunctcn=false

\DeclareSourcemap{
    \maps[datatype=bibtex]{
        \map{
        \step[fieldset=abstract, null]
        \step[fieldset=note, null]
        }
    }
}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@inproceedings{
DengRangwala-590,
   Author = {Deng, Songgaojun and Rangwala, Huzefa and Ning, Yue},
   Title = {Dynamic Knowledge Graph based Multi-Event Forecasting},
   BookTitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining},
   Series= {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
   Publisher = {ACM},
   Pages = {1585-1595},
   Note  = {Conference Proceedings
ER  -
http://www.syndetics.com/index.aspx?isbn=9781450379984/sc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/mc.gif&client=summontrial&freeimage=true;http://www.syndetics.com/index.aspx?isbn=9781450379984/lc.gif&client=summontrial&freeimage=true;},
   Abstract = {Modeling concurrent events of multiple types and their involved actors from open-source social sensors is an important task for many domains such as health care, disaster relief, and financial analysis. Forecasting events in the future can help human analysts better understand global social dynamics and make quick and accurate decisions. Anticipating participants or actors who may be involved in these activities can also help stakeholders to better respond to unexpected events. However, achieving these goals is challenging due to several factors: (i) it is hard to filter relevant information from large-scale input, (ii) the input data is usually high dimensional, unstructured, and Non-IID (Non-independent and identically distributed) and (iii) associated text features are dynamic and vary over time. Recently, graph neural networks have demonstrated strengths in learning complex and relational data. In this paper, we study a temporal graph learning method with heterogeneous data fusion for predicting concurrent events of multiple types and inferring multiple candidate actors simultaneously. In order to capture temporal information from historical data, we propose Glean, a graph learning framework based on event knowledge graphs to incorporate both relational and word contexts. We present a context-aware embedding fusion module to enrich hidden features for event actors. We conducted extensive experiments on multiple real-world datasets and show that the proposed method is competitive against various state-of-the-art methods for social event prediction and also provides much-need interpretation capabilities.},
   Keywords = {knowledge graphs; multi-event forecasting; word graphs},
   Year = {2020} }
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\nocite{*}

\printbibliography
\end{document}

The resulting bbl file is:

% $ biblatex auxiliary file $
% $ biblatex bbl format version 3.1 $
% Do not modify the above lines!
%
% This is an auxiliary file used by the 'biblatex' package.
% This file may safely be deleted. It will be recreated by
% biber as required.
%
\begingroup
\makeatletter
\@ifundefined{ver@biblatex.sty}
  {\@latex@error
     {Missing 'biblatex' package}
     {The bibliography requires the 'biblatex' package.}
      \aftergroup\endinput}
  {}
\endgroup

\refsection{0}
  \datalist[entry]{none/global//global/global}
    \entry{DengRangwala-590}{inproceedings}{}
      \name{author}{3}{}{%
        {{hash=e7a05629ab5ed8a397d8eeb48984fb4e}{%
           family={Deng},
           familyi={D\bibinitperiod},
           given={Songgaojun},
           giveni={S\bibinitperiod}}}%
        {{hash=0ff90664306811aa5274b0953c6c1e09}{%
           family={Rangwala},
           familyi={R\bibinitperiod},
           given={Huzefa},
           giveni={H\bibinitperiod}}}%
        {{hash=9e5d3130b7d925d863f1cf0a3e0b8f7e}{%
           family={Ning},
           familyi={N\bibinitperiod},
           given={Yue},
           giveni={Y\bibinitperiod}}}%
      }
      \name{namea}{3}{}{%
        {{hash=e7a05629ab5ed8a397d8eeb48984fb4e}{%
           family={Deng},
           familyi={D\bibinitperiod},
           given={Songgaojun},
           giveni={S\bibinitperiod}}}%
        {{hash=0ff90664306811aa5274b0953c6c1e09}{%
           family={Rangwala},
           familyi={R\bibinitperiod},
           given={Huzefa},
           giveni={H\bibinitperiod}}}%
        {{hash=9e5d3130b7d925d863f1cf0a3e0b8f7e}{%
           family={Ning},
           familyi={N\bibinitperiod},
           given={Yue},
           giveni={Y\bibinitperiod}}}%
      }
      \list{language}{1}{%
        {english}%
      }
      \list{publisher}{1}{%
        {ACM}%
      }
      \strng{namehash}{785787e420e9d44558be86878e44c1fb}
      \strng{fullhash}{d9a91d212203f11f38f91c301aa83419}
      \strng{bibnamehash}{d9a91d212203f11f38f91c301aa83419}
      \strng{authorbibnamehash}{d9a91d212203f11f38f91c301aa83419}
      \strng{authornamehash}{785787e420e9d44558be86878e44c1fb}
      \strng{authorfullhash}{d9a91d212203f11f38f91c301aa83419}
      \strng{nameabibnamehash}{d9a91d212203f11f38f91c301aa83419}
      \strng{nameanamehash}{785787e420e9d44558be86878e44c1fb}
      \strng{nameafullhash}{d9a91d212203f11f38f91c301aa83419}
      \field{sortinit}{1}
      \field{sortinithash}{4f6aaa89bab872aa0999fec09ff8e98a}
      \field{labelnamesource}{author}
      \field{labeltitlesource}{title}
      \field{booktitle}{Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining}
      \field{langid}{english}
      \field{languageid}{english}
      \field{lansortorder}{4}
      \field{series}{Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining}
      \field{title}{Dynamic Knowledge Graph based Multi-Event Forecasting}
      \field{usera}{C}
      \field{userd}{english}
      \field{userf}{english}
      \field{year}{2020}
      \field{dateera}{ce}
      \true{nocite}
      \field{pages}{1585\bibrangedash 1595}
      \range{pages}{11}
      \keyw{knowledge graphs; multi-event forecasting; word graphs}
    \endentry
  \enddatalist
\endrefsection
\endinput

sjtug / SJTUThesis

Reserved characters in bibtex #770