clear-datacenter / plan

xps/pdf/png/json conversion #18

Open wanghaisheng opened 7 years ago

wanghaisheng commented 7 years ago

http://www.jacobfenton.com/

I’m a journalist and software developer based in Portland, Oregon. I've spent the last decade working as a reporter, editor, and programmer in newsrooms and nonprofits in the U.S.

During the 2015-16 academic year I was a John S. Knight Journalism Fellow at Stanford University researching ways to make complex document processing affordable to reporters. I’m especially interested in turning unstructured images into data, and building tools to mine actionable news tips from some of the dullest corners of the web. You can read more about that project here.

Previously I was editorial engineer at The Sunlight Foundation, where I worked extensively on campaign finance, TV ad disclosure, and House and Senate expenditure reporting. Prior to that I was Director of Computer-Assisted Reporting, at the Investigative Reporting Workshop, a nonprofit at American University. I also reported for several newspapers in Pennsylvania.

Long ago I was an undergraduate physics major, and got my first real taste of programming hacking on C++ code to look at engineering runs at LIGO Hanford.

I can be reached at jsfenfen at gmail dot com.

https://github.com/dannyedel/dspdfviewer/issues/163 This PDF viewer can correctly display PDF files that pdftohtml (bundled with poppler-utils) cannot convert into readable Chinese.

table detect

https://github.com/Booppey/table-detection https://github.com/transpect/evolve-hub

pdf ocr

OCRmyPDF
pdfsandwich

pdfsandwich generates "sandwich" OCR pdf files, i.e. pdf files which contain only images (no text) will be processed by optical character recognition (OCR) and the text will be added to each page invisibly "behind" the images.

pdfsandwich is a command line tool which is supposed to be useful to OCR scanned books or journals. It is able to recognize the page layout even for multicolumn text.

Essentially, pdfsandwich is a wrapper script which calls the following binaries: unpaper (since version 0.0.9), convert, gs, hocr2pdf (for tesseract prior to version 3.03), and tesseract. It is known to run on Unix systems and has been tested on Linux and MacOS X. It supports parallel processing on multiprocessor systems.

While pdfsandwich works with any version of tesseract from version 3.0 on, tesseract 3.03 or later is recommended for best performance. By default, pdfsandwich runs unpaper to enhance the readability of scanned pages and to improve OCR. For instance, slightly rotated pages are automatically straightened and dark edges removed. For optimally scanned pdf files, this can be switched off by option -nopreproc to speed up processing.

3. xps<-->png/jpeg

My work-around is to save the PDF as a lossless or near lossless image such as .tiff format, then create a new PDF from the image and run OCR. Thus I lose no clarity/sharpness in the PDF image and get accurate OCR content that can be copied and pasted. And, yes, lots of folks do something similar with screenshots from protected PDFs to grab all the text (without the need to retype it). Simple non-expert scripts (such as Tornado's "Do It Again" freeware) and PDF generating software make it easy to process hundreds of pages quickly and accurately (at least as accurately as OCR from images can be from relatively high-res images - not screenshots of documents you are not zooming in on or otherwise capturing with tremendously low spatial resolution relative to the original document).

https://github.com/wanghaisheng/pdfconvertme-public

4. pdf<-->png/jpeg

5. png/jpeg<-->json

1 Online service for xps<--->pdf

libgxps-utils

3. pdf<--->html<-->json

pdfminer

pdfminer pdf to html/txt demo

pdf2htmlEX

reference for pdf<--->html

others

The best way to view an XPS is to use Mupdf. The best way to convert it to PDF is to use a wrapper around gxps. The best way to convert it to a PNG might be another wrapper around gxps or it might be to use Mudraw. And the best way to extract the text from an XPS is still to run KDE in a virtual machine.

Handling embedded fonts in PDFs: http://stackoverflow.com/questions/11093051/handling-remapping-missing-problematic-cid-cjk-fonts-in-pdf-with-ghostscript?rq=1 https://github.com/pts/pdfsizeopt http://stackoverflow.com/questions/2656329/linux-pdf-postscript-optimizing http://www.aivosto.com/vbtips/pdf-optimize.html

http://stackoverflow.com/questions/21279548/facing-issues-on-extracting-text-from-pdf-file-using-java

http://stackoverflow.com/questions/29633504/embedded-fonts-in-pdf-copy-and-paste-problems?rq=1 http://stackoverflow.com/questions/18762625/get-information-whether-text-is-extractable-from-pdf?rq=1 http://stackoverflow.com/questions/30222424/copy-text-from-pdf-with-custom-font?rq=1 http://stackoverflow.com/questions/3488042/how-can-i-extract-embedded-fonts-from-a-pdf-as-valid-font-files/3489099#3489099

http://stackoverflow.com/questions/7140476/pdf-font-mapping-error?rq=1 http://stackoverflow.com/questions/11093051/handling-remapping-missing-problematic-cid-cjk-fonts-in-pdf-with-ghostscript?rq=1 http://stackoverflow.com/questions/25602262/ghostscript-re-encoding-embedded-font?rq=1 http://stackoverflow.com/questions/28797418/replace-all-font-glyphs-in-a-pdf-by-converting-them-to-outline-shapes?rq=1 http://stackoverflow.com/questions/15722099/issues-decoding-flate-from-pdf-embedded-font?rq=1 http://stackoverflow.com/questions/3647940/pdf-on-linux-combine-font-subsets-and-replace-type-3-with-type-1?rq=1 http://stackoverflow.com/questions/3036373/altering-an-embedded-truetype-font-so-it-will-be-usable-by-windows-gdi?rq=1

wanghaisheng commented 7 years ago

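Overall pipeline for processing XPS, PDF, and image files:

xps -----> xps parsing module ------> JSON intermediate format ------> xps extraction module -------> key-value pairs
xps -----> xps parsing module ------> XML intermediate format ------> xps extraction module -------> key-value pairs
pdf -----> pdf parsing module ------> parsed to txt -------> pdf extraction module
pdf -----> pdf parsing module ------> parsed to html
pdf -----> pdf parsing module ------> parsed to xml
pdf -----> pdf parsing module ------> parsed to json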

wanghaisheng commented 7 years ago

for xpdf

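This library offers the same functionality as poppler-utils but is no longer maintained. It can mainly extract text, extract HTML, list fonts, show basic document information, and extract embedded images.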

docker run -it --rm --name pdf-miner-demo -v /home/wanghs/dockerfiles-repo/docker-for-fun/docker-alpine/projects/pdf-parser:/tmp dc/alpine-python2 /bin/sh

The Xpdf package honors these permission settings. Specifically:

xpdf will not copy/paste from a PDF file which disallows copying text/graphics
xpdf and pdftops will not print (convert to PostScript) a PDF file which disallows printing
pdftotext will not convert a PDF file which disallows copying text/graphics
pdfimages will not extract images from a PDF file which disallows copying text/graphics 
From ubuntu:15.10

# docker build -t dc/xpdf .

ADD sources.list /etc/apt/sources.list
ADD . /tmp
RUN cd /tmp && \
    tar xvf xpdfbin-linux-3.04.tar.gz && \
    cd xpdfbin-linux-3.04 && \
    cp bin64/* /usr/local/bin && mkdir -p /usr/local/man/man1 /usr/local/man/man5 && cp doc/*.1 /usr/local/man/man1 && cp doc/*.5 /usr/local/man/man5 && \
    cd /tmp && tar xvf xpdf-chinese-simplified.tar.gz && \
    cd xpdf-chinese-simplified && mkdir /usr/local/share/xpdf &&  \
    mkdir /usr/local/share/xpdf/chinese-simplified && \
    mv * /usr/local/share/xpdf/chinese-simplified && \
    mv /tmp/xpdfrc /usr/local/etc/xpdfrc

xpdfrc

#========================================================================
#
# Sample xpdfrc file
#
# The Xpdf tools look for a config file in two places:
# 1. ~/.xpdfrc
# 2. in a system-wide directory, typically /usr/local/etc/xpdfrc
#
# This sample config file demonstrates some of the more common
# configuration options.  Everything here is commented out.  You
# should edit things (especially the file/directory paths, since
# they'll likely be different on your system), and uncomment whichever
# options you want to use.  For complete details on config file syntax
# and available options, please see the xpdfrc(5) man page.
#
# Also, the Xpdf language support packages each include a set of
# options to be added to the xpdfrc file.
#
# http://www.foolabs.com/xpdf/
#
#========================================================================

#----- display fonts

# These map the Base-14 fonts to the Type 1 fonts that ship with
# ghostscript.  You'll almost certainly want to use something like
# this, but you'll need to adjust this to point to wherever
# ghostscript is installed on your system.  (But if the fonts are
# installed in a "standard" location, xpdf will find them
# automatically.)

#fontFile Times-Roman       /usr/local/share/ghostscript/fonts/n021003l.pfb
#fontFile Times-Italic      /usr/local/share/ghostscript/fonts/n021023l.pfb
#fontFile Times-Bold        /usr/local/share/ghostscript/fonts/n021004l.pfb
#fontFile Times-BoldItalic  /usr/local/share/ghostscript/fonts/n021024l.pfb
#fontFile Helvetica     /usr/local/share/ghostscript/fonts/n019003l.pfb
#fontFile Helvetica-Oblique /usr/local/share/ghostscript/fonts/n019023l.pfb
#fontFile Helvetica-Bold        /usr/local/share/ghostscript/fonts/n019004l.pfb
#fontFile Helvetica-BoldOblique /usr/local/share/ghostscript/fonts/n019024l.pfb
#fontFile Courier       /usr/local/share/ghostscript/fonts/n022003l.pfb
#fontFile Courier-Oblique   /usr/local/share/ghostscript/fonts/n022023l.pfb
#fontFile Courier-Bold      /usr/local/share/ghostscript/fonts/n022004l.pfb
#fontFile Courier-BoldOblique   /usr/local/share/ghostscript/fonts/n022024l.pfb
#fontFile Symbol            /usr/local/share/ghostscript/fonts/s050000l.pfb
#fontFile ZapfDingbats      /usr/local/share/ghostscript/fonts/d050000l.pfb

# If you need to display PDF files that refer to non-embedded fonts,
# you should add one or more fontDir options to point to the
# directories containing the font files.  Xpdf will only look at .pfa,
# .pfb, .ttf, and .ttc files in those directories (other files will
# simply be ignored).

#fontDir        /usr/local/fonts/bakoma

#----- PostScript output control

# Set the default PostScript file or command.

#psFile         "|lpr -Pmyprinter"

# Set the default PostScript paper size -- this can be letter, legal,
# A4, or A3.  You can also specify a paper size as width and height
# (in points).

#psPaperSize        letter

#----- text output control

# Choose a text encoding for copy-and-paste and for pdftotext output.
# The Latin1, ASCII7, and UTF-8 encodings are built into Xpdf.  Other
# encodings are available in the language support packages.

#textEncoding       UTF-8

# Choose the end-of-line convention for multi-line copy-and-paste and
# for pdftotext output.  The available options are unix, mac, and dos.

#textEOL        unix

#----- misc settings

# Enable FreeType, and anti-aliased text.

#enableFreeType     yes
#antialias      yes

# Set the command used to run a web browser when a URL hyperlink is
# clicked.

#launchCommand  viewer-script
#urlCommand "netscape -remote 'openURL(%s)'"
#----- begin Chinese Simplified support package (2011-sep-02)
cidToUnicode    Adobe-GB1   /usr/local/share/xpdf/chinese-simplified/Adobe-GB1.cidToUnicode
unicodeMap  ISO-2022-CN /usr/local/share/xpdf/chinese-simplified/ISO-2022-CN.unicodeMap
unicodeMap  EUC-CN      /usr/local/share/xpdf/chinese-simplified/EUC-CN.unicodeMap
unicodeMap  GBK     /usr/local/share/xpdf/chinese-simplified/GBK.unicodeMap
cMapDir     Adobe-GB1   /usr/local/share/xpdf/chinese-simplified/CMap
toUnicodeDir            /usr/local/share/xpdf/chinese-simplified/CMap

fontFileCC1 Adobe-GB1   /usr/local/share/xpdf/chinese-simplified/fonts/gbsn00lp.ttf
fontFileCC2 Adobe-GB1   /usr/local/share/xpdf/chinese-simplified/fonts/gkai00mp.ttf

#----- end Chinese Simplified support package

The original file is

1.xps.zip

root@30c7a47055f5:/tmp# xpstopdf test-data/original-file/1.xps 

The converted result: 1xpsto.pdf.zip

root@6334724bdee5:/tmp# pdffonts original-file/1xpsto.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
LCRSCA+SimSun                        CID TrueType      Identity-H       yes yes yes      5  0
PAIKWA+SimSun                        TrueType          WinAnsi          yes yes yes      6  0

pdfminer handles this file correctly

root@30c7a47055f5:/tmp# pdf2txt.py -o test-data/original-file/1.x.output.html -Y exact test-data/original-file/1xpsto.pdf 

The result is

1.x.output.html.zip

The same report as a PDF: 1.pdf.zip

root@6334724bdee5:/tmp# pdffonts 1.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
SRPUEP+SimSun                        TrueType          WinAnsi          yes yes yes     13  0
root@6334724bdee5:/tmp# pdfinfo 1.pdf 
Creator:        Online2PDF.com
Producer:       Online2PDF.com
CreationDate:   Sat Aug 13 07:42:20 2016 UTC
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          1
Encrypted:      no
Page size:      594.75 x 419.25 pts
Page rot:       0
File size:      31707 bytes
Optimized:      no
PDF version:    1.4

For this file, however, pdfminer produces garbled (cid:N) output

root@30c7a47055f5:/tmp# pdf2txt.py -o test-data/1.x.output.html -Y exact test-data/1.pdf

1.x.output.html.zip
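
pdfminer writes a `(cid:N)` placeholder for each glyph it cannot map to Unicode, which is the garbling reported above. The following is a minimal sketch of how such output could be detected automatically so that a file can be routed to OCR instead. The function names and the 0.5 threshold are illustrative assumptions, not part of pdfminer.

```python
import re

# pdfminer emits "(cid:123)" when a glyph has no Unicode mapping.
CID_RE = re.compile(r"\(cid:\d+\)")


def cid_ratio(text):
    """Return the fraction of characters that are (cid:N) placeholders."""
    cids = CID_RE.findall(text)
    remaining = CID_RE.sub("", text)
    # Count only visible characters so whitespace does not dilute the ratio.
    visible = sum(1 for ch in remaining if not ch.isspace())
    total = len(cids) + visible
    return len(cids) / total if total else 0.0


def needs_ocr(text, threshold=0.5):
    """Hypothetical policy: fall back to OCR when most glyphs are unmapped."""
    return cid_ratio(text) > threshold
```

In practice the text passed in would be the output of `pdf2txt.py`; a high ratio suggests the embedded font lacks a usable Unicode mapping.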

Surprisingly, running pdftohtml (bundled with xpdf) directly on the PDF converted from the XPS above

root@6334724bdee5:/tmp# pdftohtml  original-file/1xpsto.pdf  22.html

gives a very good result

poppler itself is based on xpdf and the libraries are the same, so the results are identical
[22s.zip](https://github.com/clear-datacenter/plan/files/524885/22s.zip)
[22-poppler.html.zip](https://github.com/clear-datacenter/plan/files/524886/22-poppler.html.zip)

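The solution seems to be configuring fontconfig so that every font reported by pdffonts 1.pdf (for example SRPUEP+SimSun) is replaced with a standard font, that is, one of the few fonts we expect.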

https://lists.freedesktop.org/archives/poppler-bugs/2013-November/010909.html

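1. For this particular PDF file, text copied out of Adobe Reader is garbled; no usable text comes out at all.
2. After conversion with OCRmyPDF the encoding is lost: https://github.com/jbarlow83/OCRmyPDF/issues/99
3. Reading it with pdf.js also failed: https://github.com/mozilla/pdf.js/issues/7712 According to the pdf.js author, OCR is the only option.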

4. Following the advice here, rebuild the PDF with gs: http://stackoverflow.com/questions/12703387/pdf-font-encoding-why-cant-i-copy-text-from-a-pdf

root@6334724bdee5:/tmp# pdfinfo 11.pdf
Creator:        Online2PDF.com
Producer:       Online2PDF.com
CreationDate:   Sat Aug 13 07:42:20 2016 UTC
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          1
Encrypted:      no
Page size:      594.75 x 419.25 pts
Page rot:       0
File size:      31707 bytes
Optimized:      no
PDF version:    1.4
root@6334724bdee5:/tmp# pdfinfo 1.pdf
Creator:        Online2PDF.com
Producer:       Online2PDF.com
CreationDate:   Sat Aug 13 07:42:20 2016 UTC
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          1
Encrypted:      no
Page size:      594.75 x 419.25 pts
Page rot:       0
File size:      31707 bytes
Optimized:      no
PDF version:    1.4
root@6334724bdee5:/tmp# pdfinfo 2.pdf
Creator:        Online2PDF.com
Producer:       Online2PDF.com
CreationDate:   Sat Aug 13 07:42:20 2016 UTC
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          1
Encrypted:      no
Page size:      594.75 x 419.25 pts
Page rot:       0
File size:      37416 bytes
Optimized:      no
PDF version:    1.4

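For now, adding an OCR layer solved the problem (https://github.com/jbarlow83/OCRmyPDF/issues/99), and pdfminer can now extract the text as well.

It looks like the font needs to be converted.

Extractable: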
root@6334724bdee5:/tmp# pdffonts 11-ocr-output.pdf 
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
EGKPEK+GlyphLessFont                 CID TrueType      Identity-H       yes yes yes     10  0

Not extractable:
root@6334724bdee5:/tmp# pdffonts 11.pdf 
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
SRPUEP+SimSun                        TrueType          WinAnsi          yes yes yes     13  0
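
To compare font tables like the two above across many files, the `pdffonts` output can be turned into records. This is a minimal sketch that assumes font names and encodings contain no spaces, so each row can be split on whitespace and read from the right; `parse_pdffonts` is a hypothetical helper, not part of xpdf or poppler.

```python
def parse_pdffonts(output):
    """Parse the text printed by `pdffonts` into a list of dicts."""
    lines = output.splitlines()
    # Rows start after the dashed separator line, if there is one.
    start = next(
        (i + 1 for i, line in enumerate(lines) if line.startswith("---")), 0
    )
    fonts = []
    for line in lines[start:]:
        tokens = line.split()
        # A row has at least: name, type, encoding, emb, sub, uni, obj, gen.
        if len(tokens) < 8 or not (tokens[-1].isdigit() and tokens[-2].isdigit()):
            continue
        fonts.append({
            "name": tokens[0],
            # The type may span several words, e.g. "CID TrueType".
            "type": " ".join(tokens[1:-6]),
            "encoding": tokens[-6],
            "emb": tokens[-5] == "yes",
            "sub": tokens[-4] == "yes",
            "uni": tokens[-3] == "yes",
            "object_id": (int(tokens[-2]), int(tokens[-1])),
        })
    return fonts
```

The records only describe the fonts. Whether text is extractable still has to be checked per file, since the PDF converted from XPS above contains a WinAnsi SimSun subset too and pdfminer handled it.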
wanghaisheng commented 7 years ago

For libgxps2 / libgxps-utils: when the XPS that previously came out garbled in pdfminer is converted with this library, the Chinese text is displayed correctly.

➜  xpstools git:(master) ✗ cat Dockerfile 

From ubuntu:15.10

# docker build -t dc/xpstools .
# docker run --rm --name xpstools -it -v $(pwd):/tmp dc/xpstools /bin/sh
ADD sources.list /etc/apt/sources.list
ADD .  /tmp
RUN apt-get -y update && apt-get install -y make gcc libgxps2 libgxps-utils 
#root@31c08e7de6a0:/tmp# xpstopdf 3.xps  

The default conversion on Ubuntu is in a library called libgxps. This is used by Evince (the default document viewer), and a number of command-line tools. One tool is xpstopdf, which sounds just right.

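The problem now is that for test file 3, whether the PDF comes from an online converter or from libgxps, the layout analysis in the HTML produced by pdfminer is wrong.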

wanghaisheng commented 7 years ago

from xps <-->pdf<-->html<-->json

for pdfminer

based on ALPINE LINUX

# docker build -t dc/alpine-python2 .
FROM dc/alpine

RUN apk update && apk upgrade

RUN apk add python

# Clean APK cache
RUN rm -rf /var/cache/apk/*

docker run -it --rm --name pdf-miner-demo -v /home/wanghs/dockerfiles-repo/docker-for-fun/docker-alpine/projects/pdf-parser:/tmp dc/alpine-python2 /bin/sh

Embed pdfminer in the OpenResty ("or") image to provide an HTTP service


[wanghs@db2 alpine-or-python2-based-pdfminer]$ cat Dockerfile
FROM dc/openresty-alpine

#  docker build -t dc/alpine-or-python2-pdfminer .
RUN apk update && apk upgrade

RUN apk add python git

# Clean APK cache
RUN rm -rf /var/cache/apk/*

RUN cd /tmp && git clone https://github.com/euske/pdfminer && \
   cd pdfminer && python setup.py install
FROM alpine:3.3
MAINTAINER edwin_uestc <edwin_uestc@163.com>

ENV LUA_SUFFIX=jit-2.1.0-beta1 \
    LUAJIT_VERSION=2.1 \
    NGINX_PREFIX=/opt/openresty/nginx \
    OPENRESTY_PREFIX=/opt/openresty \
    OPENRESTY_SRC_SHA1=1a2029e1c854b6ac788b4d734dd6b5c53a3987ff \
    OPENRESTY_VERSION=1.9.7.3 \
    VAR_PREFIX=/var/nginx

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/' /etc/apk/repositories

RUN set -ex \
  && apk --no-cache add --virtual .build-dependencies \
    curl \
    make \
    musl-dev \
    gcc \
    ncurses-dev \
    openssl-dev \
    pcre-dev \
    perl \
    readline-dev \
    zlib-dev \
  \
  && curl -fsSL http://openresty.org/download/openresty-${OPENRESTY_VERSION}.tar.gz -o /tmp/openresty.tar.gz \
  \
  && cd /tmp \
  && echo "${OPENRESTY_SRC_SHA1} *openresty.tar.gz" | sha1sum -c - \
  && tar -xzf openresty.tar.gz \
  \
  && cd openresty-* \
  && readonly NPROC=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || 1) \
  && ./configure \
    --prefix=${OPENRESTY_PREFIX} \
    --http-client-body-temp-path=${VAR_PREFIX}/client_body_temp \
    --http-proxy-temp-path=${VAR_PREFIX}/proxy_temp \
    --http-log-path=${VAR_PREFIX}/access.log \
    --error-log-path=${VAR_PREFIX}/error.log \
    --pid-path=${VAR_PREFIX}/nginx.pid \
    --lock-path=${VAR_PREFIX}/nginx.lock \
    --with-luajit \
    --with-pcre-jit \
    --with-ipv6 \
    --with-http_ssl_module \
    --without-http_ssi_module \
    --with-http_realip_module \
    --without-http_scgi_module \
    --without-http_uwsgi_module \
    --without-http_userid_module \
    -j${NPROC} \
  && make -j${NPROC} \
  && make install \
  \
  && rm -rf /tmp/openresty-* \
  && apk del .build-dependencies

RUN ln -sf ${NGINX_PREFIX}/sbin/nginx /usr/local/bin/nginx \
  && ln -sf ${NGINX_PREFIX}/sbin/nginx /usr/local/bin/openresty \
  && ln -sf ${OPENRESTY_PREFIX}/bin/resty /usr/local/bin/resty \
  && ln -sf ${OPENRESTY_PREFIX}/luajit/bin/luajit-* ${OPENRESTY_PREFIX}/luajit/bin/lua \
  && ln -sf ${OPENRESTY_PREFIX}/luajit/bin/luajit-* /usr/local/bin/lua

RUN apk --no-cache add \
    libgcc \
    libpcrecpp \
    libpcre16 \
    libpcre32 \
    libssl1.0 \
    libstdc++ \
    openssl \
    pcre

WORKDIR $NGINX_PREFIX

CMD ["nginx", "-g", "daemon off; error_log /dev/stderr info;"]

https://github.com/felipeochoa/minecart

based on Ubuntu


From ubuntu:15.10

#  docker build -t dc/pdfminer-programming .
# docker run --rm --name pdfminer-programming-demo -it -v $(pwd):/tmp dc/pdfminer-programming /bin/sh
ADD sources.list /etc/apt/sources.list

RUN apt-get -y update && apt-get install -y make gcc git curl libgxps2 libgxps-utils   xz-utils zlib1g-dev python-pip

# http://bugs.python.org/issue19846
# > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK.
ENV LANG C.UTF-8

# gpg: key 18ADD4FF: public key "Benjamin Peterson <benjamin@python.org>" imported
RUN gpg --keyserver ha.pool.sks-keyservers.net --recv-keys C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF

ENV PYTHON_VERSION 2.7.12

# if this is called "PIP_VERSION", pip explodes with "ValueError: invalid truth value '<VERSION>'"
ENV PYTHON_PIP_VERSION 8.1.2

RUN set -x \
    && mkdir -p /usr/src/python \
    && curl -L "http://mirrors.sohu.com/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tar.xz" -o python.tar.xz \
    && tar -xJC /usr/src/python --strip-components=1 -f python.tar.xz \
    && rm python.tar.xz* \
    && cd /usr/src/python \
    && ./configure --enable-shared --enable-unicode=ucs4 \
    && make -j$(nproc) \
    && make install \
    && ldconfig \
    && pip install -i https://pypi.tuna.tsinghua.edu.cn/simple  -U pip  \
    && pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade pip==$PYTHON_PIP_VERSION \
    && find /usr/local \
        \( -type d -a -name test -o -name tests \) \
        -o \( -type f -a -name '*.pyc' -o -name '*.pyo' \) \
        -exec rm -rf '{}' + \
    && rm -rf /usr/src/python

# install "virtualenv", since the vast majority of users of this image will want it
RUN pip install --no-cache-dir virtualenv

#root@31c08e7de6a0:/tmp# xpstopdf 3.xps  

RUN cd /tmp  && \
    git clone https://github.com/wanghaisheng/pdfminer && \
   cd pdfminer  && make cmap &&  python setup.py install 

Step 1: Whether or not the file is already a PDF, convert it again with a conversion tool. This may require gs: https://github.com/coolwanglu/pdf2htmlEX/wiki/Optimizing-PDF-Files


Lots of PDF files contain data that may not be necessary, for example annotations, unused objects, and deleted objects. This information can be removed without affecting the visual appearance, while making the PDF files smaller.

Note that pdf2htmlEX is designed to be a converter, not an optimizer, so it is generally a good idea to optimize a PDF file before feeding it to pdf2htmlEX.

PDF Optimizers

Ghostscript is an open source tool that can be used to optimize PDF files. You can try gs -sDEVICE=pdfwrite -sOutputFile='output.pdf' -dNOPAUSE -dBATCH input.pdf, or more advanced options. You should also read its documentation for the full power of Ghostscript.

Others include Adobe Acrobat, or any you can find online. The tools have different advantages and disadvantages, so you should try different ones on your files and find the best one for you.
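
For batch use, the Ghostscript command quoted above can be wrapped in a small helper. This is only a sketch: the flags are exactly the ones from the quoted command, while the function names are illustrative and `gs` is assumed to be on the PATH.

```python
import subprocess


def ghostscript_rewrite_cmd(src, dst):
    """Build the argument list for rewriting `src` into `dst` with pdfwrite."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-sOutputFile={}".format(dst),
        "-dNOPAUSE",
        "-dBATCH",
        src,
    ]


def rewrite_pdf(src, dst):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(ghostscript_rewrite_cmd(src, dst), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.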

Step 2: Generate the HTML file with pdfminer

pdf2txt.py -o output.html -Y exact 3.xps.pdf

Step 3: Preprocess the generated HTML to obtain raw JSON. Step 4: Break the raw JSON down into the target JSON, computing the number of lines for each block.

import subprocess
subprocess.call(["pdf2htmlEX", "/path/to/foobar.pdf"])

wanghaisheng commented 7 years ago

from xps <-->pdf<--->png/jpeg<-->html<-->json

wanghaisheng commented 7 years ago

from xps <--->png/jpeg pdf<-->png/jpeg

wanghaisheng commented 7 years ago
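  1. If two top values differ by less than a configured threshold (for example 3px), replace the larger value with the smaller one.
  2. The number of distinct top values found in spans such as
<span style="position:absolute; color:black; left:247px; top:82px; font-size:26px;">属</span>

determines the number of lines in the whole layout. If a top value is followed by a smaller top value and then the original top value appears again, merge the two groups into one, using the smaller top value.

Likewise, if a top value is followed by a much larger top value and then the original top value appears again, move the much larger top value to its proper position in the order.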

<span style="position:absolute; color:black; left:49px; top:154px; font-size:12px;">姓</span>
<span style="position:absolute; color:black; left:62px; top:154px; font-size:12px;">名</span>
<span style="position:absolute; color:black; left:76px; top:154px; font-size:12px;">:</span>

<span style="position:absolute; color:black; left:94px; top:152px; font-size:14px;">姚</span>
<span style="position:absolute; color:black; left:110px; top:152px; font-size:14px;">荣</span>
<span style="position:absolute; color:black; left:126px; top:152px; font-size:14px;">炳</span>

<span style="position:absolute; color:black; left:205px; top:154px; font-size:12px;">性</span>
<span style="position:absolute; color:black; left:219px; top:154px; font-size:12px;">别</span>
<span style="position:absolute; color:black; left:232px; top:154px; font-size:12px;">:</span>
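
Rule 1 can be tried directly on spans like the ones above, where the label characters sit at top:154px and the value characters at top:152px. This is a minimal sketch assuming the exact span format shown in this thread; the helper names are hypothetical, and the threshold is measured from the smallest top value of each group.

```python
import re

# Matches the character spans that pdfminer's "-Y exact" HTML contains here.
SPAN_RE = re.compile(
    r'<span style="position:absolute; color:black; '
    r'left:(\d+)px; top:(\d+)px; font-size:(\d+)px;">(.*?)</span>'
)


def parse_char_spans(html):
    """Extract left/top/font-size/text from every character span."""
    return [
        {"left": int(left), "top": int(top), "size": int(size), "text": text}
        for left, top, size, text in SPAN_RE.findall(html)
    ]


def snap_tops(spans, threshold=3):
    """Replace each top by the smallest top that lies within `threshold` px."""
    canonical = {}
    anchor = None
    for top in sorted({s["top"] for s in spans}):
        # Start a new group once the gap to the group's smallest top
        # reaches the threshold; otherwise reuse that smallest top.
        if anchor is None or top - anchor >= threshold:
            anchor = top
        canonical[top] = anchor
    return [dict(s, top=canonical[s["top"]]) for s in spans]
```

With the default threshold of 3px, the 154px and 152px characters above end up on the same top value, so they can later be treated as one line.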
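  3. When the HTML contains several border spans, first sort all border spans by width, deduplicate them, and keep only the 10 or 5 largest values (configurable). Then sort by left and remove spans whose left value is greater than 50 and spans whose width is less than 300.

    1. Sort by width and remove every span ranked below fifth (that is, remove those with width less than 500).

    1. For the spans whose width ranks in the top five, filter on the fifth-ranked left value (50 is used here).
    2. Sort the remaining spans by top; when several spans share the same top value, keep only the widest one.

Using the lines as separators, divide the whole layout into several blocks.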

<span style="position:absolute; border: black 1px solid; left:39px; top:230px; width:520px; height:0px;"></span>
<span style="position:absolute; border: black 1px solid; left:39px; top:149px; width:520px; height:0px;"></span>
<span style="position:absolute; border: black 1px solid; left:39px; top:859px; width:520px; height:0px;"></span>
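
The border-span filtering and block splitting described above can be sketched as follows. Assumptions: the border spans have exactly the format shown here, the left and width limits (50 and 300) are applied as two independent filters, and the "keep the N widest" step is left out for brevity. `separator_tops` and `block_index` are hypothetical names.

```python
import re
from bisect import bisect_right

BORDER_RE = re.compile(
    r'<span style="position:absolute; border: black 1px solid; '
    r'left:(\d+)px; top:(\d+)px; width:(\d+)px; height:(\d+)px;"></span>'
)


def separator_tops(html, max_left=50, min_width=300):
    """Return the sorted top values of the rules that split the page."""
    widest = {}
    for left, top, width, _height in BORDER_RE.findall(html):
        left, top, width = int(left), int(top), int(width)
        # Drop short or indented borders (table cells, underlines).
        if left > max_left or width < min_width:
            continue
        # For spans sharing a top value, remember only the widest one.
        widest[top] = max(width, widest.get(top, 0))
    return sorted(widest)


def block_index(top, separators):
    """Index of the block a character with this top value falls into."""
    return bisect_right(separators, top)
```

With the three rules above (tops 149, 230, and 859), a character at top:154px lands in block 1, the band between the first and second rule.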

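4. Sort all top values that occur within a block. If the difference between two adjacent top values is less than the sum of the font-size values associated with them, treat them as the same line; if the difference is greater than that sum, treat them as different lines.

5. Within each block, normalize the font sizes, using the smaller value. Then sort all left values that share the same top value; if the difference between two consecutive left values is less than font-size, add font-size to the larger left value and to every left value after it.

6. If the difference between two left values with the same top is greater than two font-size values (or a configurable value), treat it as a meaningful separator between blocks. Anything smaller is not meaningful and is only a gap inside a block, and all spaces between values inside a block must be removed. (An alternative is to normalize all left values: keep the minimum and add font-size to it successively in order. With that approach, if the character at a given position is itself a space, the spaces still have to be removed after the string is joined, so it was abandoned.)

7. Within the same top value, that is, the same line, to decide whether a character belongs to the same segment as the previous character, take the left value of the first character and add the font-size values of all characters from the first character up to the character being tested. If the character's left value is less than this expected value, it belongs to the same segment as the previous character; if it is greater, it belongs to a different segment.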
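
Rule 7 can be sketched on one line of characters. Two readings are assumed here because the note leaves them open: the "first character" is the first character of the current segment, and the font-size sum includes the character being tested. `split_segments` is a hypothetical helper.

```python
def split_segments(chars):
    """Split one line of (left, font_size, text) tuples into segment strings."""
    segments = []
    current = []
    for left, size, text in sorted(chars):
        if current:
            # Expected right-most position if the characters were adjacent:
            # left of the segment's first character plus all font sizes so
            # far, including the character under test.
            expected = current[0][0] + sum(c[1] for c in current) + size
            if left > expected:
                segments.append(current)
                current = []
        current.append((left, size, text))
    if current:
        segments.append(current)
    return ["".join(c[2] for c in seg) for seg in segments]


line = [
    (49, 12, "姓"), (62, 12, "名"), (76, 12, ":"),
    (94, 14, "姚"), (110, 14, "荣"), (126, 14, "炳"),
    (205, 12, "性"), (219, 12, "别"), (232, 12, ":"),
]
segments = split_segments(line)
```

The sample line uses the left and font-size values of the spans quoted earlier in this comment; the large gap before left:205px is what starts a second segment.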

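The JSON after this preliminary processing is shown below. Because values cannot yet be separated from keys, all content is kept in a single field for now, pending later processing.

block: a large block delimited by horizontal rules and similar separators
lines: the lines within each block
segments: the distinct segments within each line

confidence represents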

{
    "blocks": [
        {
            "id": 1,
            "lines": [
                {
                    "id": 1,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",
                            "key": "南 京 医 科 大 学 附 属 常 州 市 第 二 人 民 医 院"
                        }
                    ]
                },
                {
                    "id": 2,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "M R I 诊 断 报 告 单"
                        }
                    ]
                }
            ]
        },
        {
            "id": 2,
            "lines": [
                {
                    "id": 1,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "姓名:姚荣炳"
                        },
                        {
                            "id": 2,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",
                            "key": "性别:男"
                        },
                        {
                            "id": 3,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "年龄:78岁"
                        },
                        {
                            "id": 4,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "检查号:F1440092"
                        }
                    ]
                },
                {
                    "id": 2,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "科 室 : 神 经 内 科  "
                        },
                        {
                            "id": 2,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "病 区 :十八病区"
                        },
                        {
                            "id": 3,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "床 号 : 3 4  "
                        },
                        {
                            "id": 4,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": " 住 院 号 : 6 9 9 0 7 5"
                        }
                    ]
                },
                {
                    "id": 3,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "门诊号: "
                        },
                        {
                            "id": 2,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": " 报告日期:2015-06-15"
                        },
                        {
                            "id": 3,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "报告时间 : 17:14:28 "
                        }
                    ]
                }
            ]
        },
        {
            "id": 3,
            "lines": [
                {
                    "id": 1,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "检查名称:颅脑MR平扫+弥散成像"
                        }
                    ]
                },
                {
                    "id": 2,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "影像表现 : xxxxxx  "
                        }
                    ]
                },
                {
                    "id": 3,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "诊断:(1)xxxxxx (2)xxxxx (3)xxxxxxx "
                        }
                    ]
                },
                {
                    "id": 4,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "建议: "
                        }
                    ]
                },
                {
                    "id": 5,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "报告医师: "
                        },
                        {
                            "id": 2,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": " 审核医师:"
                        }
                    ]
                }
            ]
        },
        {
            "id": 4,
            "lines": [
                {
                    "id": 1,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "本报告仅供临床参考"
                        }
                    ]
                },
                {
                    "id": 2,
                    "segments": [
                        {
                            "id": 1,
                            "confidence": "AAAAAA",
                            "pos": "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71",                            
                            "key": "金马扬名  www.jinpacs.com"
                        }
                    ]
                }
            ]
        }
    ]
}
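A minimal sketch for reading the "pos" strings in the JSON above. It relies only on what the sample shows: boxes separated by `|`, each made of four dash-separated integers. Reading the four numbers as (x0, y0, x1, y1) is my assumption, and `parse_pos` and `bounding_box` are hypothetical helper names.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]


def parse_pos(pos: str) -> List[Box]:
    """Split a "pos" string into a list of 4-integer boxes."""
    boxes = []
    for chunk in pos.strip().split("|"):
        if not chunk:
            continue
        parts = [int(p) for p in chunk.split("-")]
        if len(parts) != 4:
            raise ValueError(f"expected 4 integers per box, got {chunk!r}")
        boxes.append((parts[0], parts[1], parts[2], parts[3]))
    return boxes


def bounding_box(boxes: List[Box]) -> Box:
    """Smallest box covering all boxes, ASSUMING (x0, y0, x1, y1) order."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


sample = "160-42-192-74|197-42-229-75|237-47-246-71|466-42-498-74|503-42-535-75|540-47-554-71"
boxes = parse_pos(sample)
segment_box = bounding_box(boxes)
```

Splitting on `-` would break on negative coordinates; the sample has none, so the sketch does not handle them.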
wanghaisheng commented 7 years ago

Other reference material

Fonts in PDF

With the goal of getting tests\viewer.py to render fonts appropriately, I've been reading about how to extract font programs from PDF documents so that they can be displayed in tkinter. Loading fonts into tkinter is a non-trivial task, but I have a solution for loading fonts on Windows, provided they are in a recognized format and we have the name of the family (see my Stack Overflow answer for details).

The question then becomes: how do we extract the font family name and the embedded font program (if any) from the PDF document? I'm putting together this wiki page to keep track of my efforts towards that question.

How are fonts stored/referenced in PDF?

When drawing text on a PDF page, the application keeps track of what's known as the text state. The text state has a parameter, Tf, called the text font. Whenever text is drawn on the page, it is drawn using the font stored in the Tf field of the text state. The text font is set and updated through the Tf graphics operator.

When using the Tf operator, the first argument is "the name of a font resource in the Font subdictionary of the current resource dictionary" (p. 398). These font resources are themselves dictionaries, identified by having their 'Type' set to /Font [1]. Using minecart and pdfminer, we can explore these structures with the following code:

import minecart
import pdfminer.pdfpage

# Open the PDF and take its first page.
doc = minecart.Document(open("path/to/sample.pdf", 'rb'))
page = next(pdfminer.pdfpage.PDFPage.create_pages(doc.doc))
# The page's resource dictionary maps font resource names to font dictionaries.
fonts = page.resources['Font']
print(fonts)
# {'F0': <PDFObjRef:7>}
font = fonts['F0'].resolve()
print(font)
# {'Encoding': /Identity-H,
#  'BaseFont': /HDIABS+AlbanyWTTC-Identity-H,
#  'DescendantFonts': [<PDFObjRef:26>],
#  'Subtype': /Type0,
#  'ToUnicode': <PDFObjRef:25>,
#  'Type': /Font}

At this point, the exercise becomes more of a choose-your-own-adventure, since it largely depends on the fonts that are referenced in your document.

The different types of PDF fonts

PDF allows documents to use a variety of font formats, which can be embedded with the document, included in the viewer application, or found elsewhere in the system. Font types are identified by the /Subtype entry in the dictionary; fonts can be in the following formats:

Type 1

/Subtype = /Type1. Type 1 fonts fall into two categories, distinguished by their /BaseFont key.

Multiple Master

/Subtype = /MMType1. Per Wikipedia, "Current application support for these fonts is sparse, if not entirely absent." So we won't support them either.

TrueType

/Subtype = /TrueType.

TrueType fonts must also have a /BaseFont key, whose value may be used to look up the font in the central repository. In some (rare?) cases [3], the name may be mangled and thus not usable. The entire font program can be embedded under the /FontFile2 key of the /FontDescriptor subdictionary. It can also appear under the /FontFile3 key as an OpenType font program if the stream has subtype /OpenType.
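A rough sketch of the lookup described above, to make the two cases concrete. It assumes the /FontDescriptor has already been resolved into a plain Python dict with string keys and values, and that the /FontFile3 stream is represented as a dict carrying its own "Subtype"; both shapes are hypothetical and are not what pdfminer returns directly. `embedded_truetype_program` is a made-up name.

```python
from typing import Optional, Tuple


def embedded_truetype_program(descriptor: dict) -> Optional[Tuple[str, str]]:
    """Return (descriptor key, program format) for an embedded program, else None.

    ASSUMPTION: `descriptor` is a plain dict standing in for a resolved
    /FontDescriptor; real PDF libraries wrap names and streams in their
    own object types.
    """
    # A TrueType font program sits under /FontFile2.
    if "FontFile2" in descriptor:
        return ("FontFile2", "TrueType")
    # An OpenType program sits under /FontFile3 when the stream's subtype says so.
    font_file3 = descriptor.get("FontFile3")
    if font_file3 is not None and font_file3.get("Subtype") == "OpenType":
        return ("FontFile3", "OpenType")
    # Nothing embedded: the font has to be found by its /BaseFont name.
    return None


with_truetype = embedded_truetype_program({"FontFile2": {"Length1": 1234}})
with_opentype = embedded_truetype_program({"FontFile3": {"Subtype": "OpenType"}})
not_embedded = embedded_truetype_program({"FontName": "Arial"})
```

When the result is None, the only remaining option is the /BaseFont lookup, which is where the mangled-name problem from footnote 3 matters.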

Type 3

/Subtype = /Type3.

Type 3 fonts have no font program to embed or reference. Instead, they specify PDF graphics procedures for rendering each character as a PDF shape. Rendering the text is thus a job for the shape engine and not for the text engine. I'd have to investigate how pdfminer handles Type 3 fonts, since it's possible this is taken care of already. If not, it would require adding support for Type 3 fonts through the interpreter class.

Type 0

/Subtype = /Type0.

Type 0 fonts are also called "composite fonts" in the spec. They have a "subfont" that's stored in the /DescendantFonts entry of the main font dictionary.[4] Type 0 fonts can contain two types of embedded subfonts, distinguished by the value of their /Subtype entry:

Footnotes

  1. Strictly speaking, it's not sufficient, since Type 2 CIDFonts also have Type set to /Font, but aren't actually PDF font instances.
  2. For both Type 1 and TrueType fonts, the /BaseFont entry may begin with 6 uppercase letters followed by a + sign that are extraneous to the font's family name and should be stripped out. This naming style indicates that only a subset of the font is used. I'm still not sure how to deal with these.
  3. Namely, when the font doesn't include the optional PostScript name in the name table and the font family name has spaces in it. The name can also be mangled if "the font in a source document uses a bold or italic style but there is no font data for that style" (p. 418).
  4. The actual value stored is a 1-element array containing the subfont as its only element.
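The prefix stripping from footnote 2 can be sketched in a few lines. This only implements the rule as stated there (six uppercase letters followed by a + sign); `strip_subset_tag` and `is_subset` are hypothetical names, and what to do with a subsetted font afterwards is still open.

```python
import re

# Six uppercase ASCII letters followed by "+", anchored at the start of the name.
_SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def is_subset(base_font: str) -> bool:
    """True if the /BaseFont name carries a subset prefix."""
    return bool(_SUBSET_TAG.match(base_font))


def strip_subset_tag(base_font: str) -> str:
    """Remove the subset prefix, if any, leaving the rest of the name untouched."""
    return _SUBSET_TAG.sub("", base_font)


# The /BaseFont value printed in the pdfminer example, written as a plain string.
name = "HDIABS+AlbanyWTTC-Identity-H"
stripped = strip_subset_tag(name)
```

Names without the prefix pass through unchanged, so the function can be applied to every /BaseFont value without checking first.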
wanghaisheng commented 7 years ago

for pdf2htmlEX

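Abandoned, for two reasons: 1. The community is inactive. 2. There is currently no way to turn the CSS in the converted HTML into the following form: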

<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head><body>
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:793px; height:559px;"></span>
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
<span style="position:absolute; color:black; left:196px; top:60px; font-size:21px;">福</span>
<span style="position:absolute; color:black; left:217px; top:60px; font-size:21px;">建</span>
<span style="position:absolute; color:black; left:238px; top:60px; font-size:21px;">医</span>

https://hub.docker.com/r/bwits/pdf2htmlex-alpine/

#Dockerfile to build a pdf2htmlEx image
FROM debian:wheezy

ENV REFRESHED_AT 20151007

# update debian source list
RUN echo "deb http://ftp.de.debian.org/debian sid main" >> /etc/apt/sources.list && \
    apt-get -qqy update && \
    apt-get -qqy install pdf2htmlex && \
    rm -rf /var/lib/apt/lists/*

VOLUME /pdf
WORKDIR /pdf

CMD ["pdf2htmlEX"]

based on Ubuntu

FROM ubuntu:15.04

ENV PDF2HTML_VERSION          0.12-1~git201411121058r1a6ec-0ubuntu1~precise1
ENV FONTFORGE_VERSION         20150612-0ubuntu1~precise
ENV NODEJS_VERSION            0.10.37-1chl1~precise1

MAINTAINER stephane.ledorze@gmail.com

RUN echo "deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ precise multiverse" >> /etc/apt/sources.list
RUN echo "deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ precise-updates multiverse" >> /etc/apt/sources.list

# Freshen up ubuntu
RUN apt-get update
RUN apt-get -y dist-upgrade

RUN apt-get install -y software-properties-common python-software-properties apt-utils
RUN apt-get -y install curl wget

#install fontforge PPA
#RUN apt-add-repository ppa:fontforge/fontforge

# install nodeJs PPA
RUN curl -sL https://deb.nodesource.com/setup_0.12 | bash -

RUN apt-get update

#
#Install git and all dependencies
# libtiff4-dev
RUN apt-get install -y sudo git cmake autotools-dev libjpeg-dev  libpng12-dev libgif-dev libxt-dev autoconf automake libtool bzip2 libxml2-dev libuninameslist-dev libspiro-dev python-dev libpango1.0-dev libcairo2-dev chrpath uuid-dev uthash-dev

#
#Clone the pdf2htmlEX fork of fontforge
#compile it
#
RUN git clone https://github.com/coolwanglu/fontforge.git fontforge.git
RUN cd fontforge.git && git checkout pdf2htmlEX && ./autogen.sh && ./configure && make V=1 && sudo make install

#
#Install poppler utils
#
RUN apt-get install -y libpoppler-glib-dev poppler-utils libpoppler-dev gir1.2-poppler-0.18 libpoppler-cil-dev libpopplerkit-dev libpoppler-cpp-dev libpoppler-private-dev

#
#Install cairo utils
#
RUN apt-get install -y libcairo2-dev libghc-svgcairo-dev

#
#Clone and install the pdf2htmlEX git repo
#
RUN git clone git://github.com/coolwanglu/pdf2htmlEX.git
RUN cd pdf2htmlEX && cmake . && make && sudo make install

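The number of spaces in front of each span varies.

Replace “ ” with nothing.

Replace " " with nothing. Replace " " with nothing; the 1 here can be any digit, and the span's value is empty.

Replace " )" with nothing.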

https://github.com/fmalina/transcript Use this script to understand what some of the fields in the pdf2htmlEX output mean.

pdf2htmlEX --external-hint-tool=ttfautohint --auto-hint 1 --zoom 2

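The converted CSS and HTML are separate, which differs from pdfminer. To process the output directly, the CSS styles have to be turned into inline HTML style attributes, with the help of tools such as https://github.com/davecranwell/inline-styler https://github.com/rennat/pynliner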
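To show what "inlining" means here, a minimal stdlib-only sketch follows. It is not how inline-styler or pynliner work internally; it only handles simple `.name { ... }` class rules, drops the class attribute, and ignores elements that already carry a style attribute. The class names in the sample (`x1`, `y1`, `fs0`) and both function names are made up for illustration.

```python
import re

# Matches simple class rules such as ".x1{left:196px;}".
RULE_RE = re.compile(r"\.([\w-]+)\s*\{([^}]*)\}")
# Matches an opening tag that has a class attribute.
TAG_RE = re.compile(r'<(\w+)([^>]*?)\sclass="([^"]*)"([^>]*)>')


def parse_class_rules(css: str) -> dict:
    """Map class name -> normalized declaration text ("prop:value;...")."""
    rules = {}
    for name, body in RULE_RE.findall(css):
        decls = "".join(part.strip() + ";" for part in body.split(";") if part.strip())
        rules[name] = rules.get(name, "") + decls
    return rules


def inline_classes(html: str, css: str) -> str:
    """Replace class="..." with an equivalent style="..." attribute."""
    rules = parse_class_rules(css)

    def repl(match):
        tag, before, classes, after = match.groups()
        style = "".join(rules.get(c, "") for c in classes.split())
        if not style:
            # No known rule for these classes: leave the tag as it was.
            return match.group(0)
        return f'<{tag}{before}{after} style="{style}">'

    return TAG_RE.sub(repl, html)


css = ".x1{left:196px;} .y1{top:60px;} .fs0{font-size:21px;}"
html = '<div class="x1 y1 fs0">福</div>'
inlined = inline_classes(html, css)
```

For real output, a proper CSS parser is the safer route, since selectors other than single classes are silently skipped by this sketch.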

Keywords: inline style attributes to style tags

https://www.npmjs.com/package/gulp-inline-css https://github.com/christiaan/InlineStyle

root@6e4b6b36f399:/tmp# pdf2htmlEX --external-hint-tool=ttfautohint --auto-hint 1 --zoom 2 --tounicode 1   1.pdf 

Result

1.html.zip

pdf2htmlEX --external-hint-tool=ttfautohint --auto-hint 1 --zoom 2 --tounicode 1 --correct-text-visibility 1 --process-nontext 1 --remove-unused-glyph 0 1.pdf
wanghaisheng commented 7 years ago

html to json https://github.com/fb55/htmlparser2

https://github.com/inikulin/parse5 http://demos.forbeslindesay.co.uk/htmlparser2/ http://astexplorer.net/#/1CHlCXc4n4 https://github.com/cheeriojs/cheerio

var cheerio = require('cheerio');
var fs = require('fs');
// css2json.js is a local helper module (not shown here).
var css2json = require('./css2json.js');
var $ = cheerio.load(fs.readFileSync('original_检验报告_1_xps_from_online_pdfmier.htm', 'utf-8'));

// Read the "top" value out of an element's inline style.
function topOf(elem) {
  return css2json.css2json(JSON.stringify($(elem).attr('style')))["top"];
}

// Count elements with and without a top offset, then border spans vs. text spans.
console.log($("[style*=top]").length);
console.log($("[style]").not("[style*=top]").length);
console.log($("span[style*=border]").length);
console.log($("span[style]").not("span[style*=border]").length);

// Collect the top value of every border span.
var border_tops = [];
$("span[style*=border]").each(function(i, elem) {
  border_tops.push(topOf(this));
});

console.log(uniq_fast(border_tops));

// Return the distinct items of a, keeping first-seen order.
function uniq_fast(a) {
    var seen = {};
    var out = [];
    var len = a.length;
    var j = 0;
    for(var i = 0; i < len; i++) {
         var item = a[i];
         if(seen[item] !== 1) {
               seen[item] = 1;
               out[j++] = item;
         }
    }
    return out;
}

// Skip the first and last border values (the minimum and maximum, if the tops come out in ascending order).
var blocks = uniq_fast(border_tops);
for (var i = 1; i < blocks.length - 1; i++) {
  var block = blocks[i];
  var span_tops = [];

  $("span[style]").not("span[style*=border]").each(function(j, elem) {
    var top = topOf(this);
    // 62 is a hard-coded cutoff for the first block.
    if (parseInt(top, 10) < 62) {
      console.log("y coordinate: " + parseInt(top, 10) + ", boundary: " + parseInt(block, 10) + " --- first block");
      console.log($('span[style*="top:' + top + '"]').not("span[style*=border]").length);
      // console.log($.html('span[style*="top:'+top+'"]').html());
      // Within one top value (one line), use the left values to tell segments apart.
      console.log($(this).html());
    }
    span_tops.push(top);
  });
}
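The idea in the comment above (same top means same line, and the left values separate segments) can be sketched in Python as follows. The regular expression is written against the per-character span format shown in this thread and is an assumption about that format only. The gap rule (start a new segment when the jump in left exceeds 1.5 times the font size) is my own heuristic, not something taken from the script.

```python
import re
from collections import defaultdict

# Matches text spans like:
# <span style="position:absolute; color:black; left:196px; top:60px; font-size:21px;">福</span>
# Border spans have no font-size and are therefore skipped.
SPAN_RE = re.compile(
    r'<span style="[^"]*?left:(\d+)px; top:(\d+)px; font-size:(\d+)px;">(.*?)</span>'
)


def parse_spans(html):
    """Return (left, top, font_size, char) tuples for per-character spans."""
    return [(int(l), int(t), int(s), ch) for l, t, s, ch in SPAN_RE.findall(html)]


def group_segments(spans, gap_factor=1.5):
    """Group characters into lines by top, then split lines at large left gaps."""
    by_top = defaultdict(list)
    for left, top, size, ch in spans:
        by_top[top].append((left, size, ch))
    lines = []
    for top in sorted(by_top):
        chars = sorted(by_top[top])
        segments, current = [], [chars[0]]
        for prev, cur in zip(chars, chars[1:]):
            # HEURISTIC: a horizontal jump wider than gap_factor * font size
            # starts a new segment.
            if cur[0] - prev[0] > gap_factor * prev[1]:
                segments.append(current)
                current = []
            current.append(cur)
        segments.append(current)
        lines.append((top, ["".join(c[2] for c in seg) for seg in segments]))
    return lines


sample = "\n".join([
    '<span style="position:absolute; color:black; left:224px; top:88px; font-size:12px;">号</span>',
    '<span style="position:absolute; color:black; left:236px; top:88px; font-size:12px;">:</span>',
    '<span style="position:absolute; color:black; left:245px; top:88px; font-size:12px;">3</span>',
    '<span style="position:absolute; color:black; left:340px; top:88px; font-size:12px;">申</span>',
    '<span style="position:absolute; color:black; left:352px; top:88px; font-size:12px;">请</span>',
])
lines = group_segments(parse_spans(sample))
```

Grouping by exact top is a simplification: in the sample page some labels and their values differ by one pixel (for example top:87px next to top:88px), so a real implementation would likely need a small tolerance when forming lines.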
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head><body>
<span style="position:absolute; border: gray 1px solid; left:0px; top:50px; width:793px; height:559px;"></span>
<div style="position:absolute; top:50px;"><a name="1">Page 1</a></div>
<span style="position:absolute; color:black; left:196px; top:60px; font-size:21px;">福</span>
<span style="position:absolute; color:black; left:217px; top:60px; font-size:21px;">建</span>
<span style="position:absolute; color:black; left:238px; top:60px; font-size:21px;">医</span>
<span style="position:absolute; color:black; left:260px; top:60px; font-size:21px;">科</span>
<span style="position:absolute; color:black; left:281px; top:60px; font-size:21px;">大</span>
<span style="position:absolute; color:black; left:303px; top:60px; font-size:21px;">学</span>
<span style="position:absolute; color:black; left:324px; top:60px; font-size:21px;">附</span>
<span style="position:absolute; color:black; left:346px; top:60px; font-size:21px;">属</span>
<span style="position:absolute; color:black; left:367px; top:60px; font-size:21px;">第</span>
<span style="position:absolute; color:black; left:388px; top:60px; font-size:21px;">一</span>
<span style="position:absolute; color:black; left:410px; top:60px; font-size:21px;">医</span>
<span style="position:absolute; color:black; left:431px; top:60px; font-size:21px;">院</span>
<span style="position:absolute; color:black; left:453px; top:60px; font-size:21px;">检</span>
<span style="position:absolute; color:black; left:474px; top:60px; font-size:21px;">验</span>
<span style="position:absolute; color:black; left:495px; top:60px; font-size:21px;">报</span>
<span style="position:absolute; color:black; left:517px; top:60px; font-size:21px;">告</span>
<span style="position:absolute; color:black; left:538px; top:60px; font-size:21px;">单</span>
<span style="position:absolute; color:black; left:634px; top:56px; font-size:12px;">【</span>
<span style="position:absolute; color:black; left:647px; top:56px; font-size:12px;">免</span>
<span style="position:absolute; color:black; left:659px; top:56px; font-size:12px;">疫</span>
<span style="position:absolute; color:black; left:671px; top:56px; font-size:12px;">】</span>
<span style="position:absolute; color:black; left:683px; top:56px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:689px; top:56px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:695px; top:56px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:701px; top:56px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:707px; top:56px; font-size:12px;">3</span>

<span style="position:absolute; color:black; left:188px; top:88px; font-size:12px;">门</span>
<span style="position:absolute; color:black; left:200px; top:88px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:206px; top:88px; font-size:12px;">诊</span>
<span style="position:absolute; color:black; left:218px; top:88px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:224px; top:88px; font-size:12px;">号</span>
<span style="position:absolute; color:black; left:236px; top:88px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:245px; top:88px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:251px; top:88px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:257px; top:88px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:263px; top:88px; font-size:12px;">7</span>
<span style="position:absolute; color:black; left:269px; top:88px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:275px; top:88px; font-size:12px;">8</span>
<span style="position:absolute; color:black; left:281px; top:88px; font-size:12px;">9</span>
<span style="position:absolute; color:black; left:287px; top:88px; font-size:12px;">4</span>
<span style="position:absolute; color:black; left:293px; top:88px; font-size:12px;">6</span>
<span style="position:absolute; color:black; left:300px; top:88px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:340px; top:88px; font-size:12px;">申</span>
<span style="position:absolute; color:black; left:352px; top:88px; font-size:12px;">请</span>
<span style="position:absolute; color:black; left:364px; top:88px; font-size:12px;">医</span>
<span style="position:absolute; color:black; left:376px; top:88px; font-size:12px;">生</span>
<span style="position:absolute; color:black; left:388px; top:88px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:402px; top:88px; font-size:12px;">郭</span>
<span style="position:absolute; color:black; left:414px; top:88px; font-size:12px;">玉</span>
<span style="position:absolute; color:black; left:426px; top:88px; font-size:12px;">佳</span>
<span style="position:absolute; color:black; left:438px; top:88px; font-size:12px;">/</span>
<span style="position:absolute; color:black; left:444px; top:88px; font-size:12px;">2</span>
<span style="position:absolute; color:black; left:450px; top:88px; font-size:12px;">9</span>
<span style="position:absolute; color:black; left:456px; top:88px; font-size:12px;">8</span>
<span style="position:absolute; color:black; left:462px; top:88px; font-size:12px;">5</span>

<span style="position:absolute; color:black; left:666px; top:70px; font-size:12px;">N</span>
<span style="position:absolute; color:black; left:672px; top:70px; font-size:12px;">O</span>
<span style="position:absolute; color:black; left:679px; top:70px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:691px; top:70px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:697px; top:70px; font-size:12px;">2</span>
<span style="position:absolute; color:black; left:703px; top:70px; font-size:12px;">6</span>
<span style="position:absolute; color:black; left:709px; top:70px; font-size:12px;">9</span>
<span style="position:absolute; color:black; left:523px; top:88px; font-size:12px;">申</span>
<span style="position:absolute; color:black; left:535px; top:88px; font-size:12px;">请</span>
<span style="position:absolute; color:black; left:547px; top:88px; font-size:12px;">时</span>
<span style="position:absolute; color:black; left:559px; top:88px; font-size:12px;">间</span>
<span style="position:absolute; color:black; left:571px; top:88px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:585px; top:88px; font-size:12px;">2</span>
<span style="position:absolute; color:black; left:591px; top:88px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:597px; top:88px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:603px; top:88px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:609px; top:88px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:615px; top:88px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:621px; top:88px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:627px; top:88px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:633px; top:88px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:639px; top:88px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:645px; top:88px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:651px; top:88px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:658px; top:88px; font-size:12px;">9</span>
<span style="position:absolute; color:black; left:664px; top:88px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:670px; top:88px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:676px; top:88px; font-size:12px;">9</span>
<span style="position:absolute; color:black; left:188px; top:107px; font-size:12px;">条</span>
<span style="position:absolute; color:black; left:200px; top:107px; font-size:12px;">形</span>
<span style="position:absolute; color:black; left:212px; top:107px; font-size:12px;">码</span>
<span style="position:absolute; color:black; left:224px; top:107px; font-size:12px;">号</span>
<span style="position:absolute; color:black; left:236px; top:107px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:245px; top:107px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:251px; top:107px; font-size:12px;">4</span>
<span style="position:absolute; color:black; left:257px; top:107px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:263px; top:107px; font-size:12px;">6</span>
<span style="position:absolute; color:black; left:269px; top:107px; font-size:12px;">6</span>
<span style="position:absolute; color:black; left:275px; top:107px; font-size:12px;">2</span>
<span style="position:absolute; color:black; left:281px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:287px; top:107px; font-size:12px;">8</span>
<span style="position:absolute; color:black; left:293px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:300px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:188px; top:125px; font-size:12px;">床</span>
<span style="position:absolute; color:black; left:200px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:206px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:212px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:218px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:224px; top:125px; font-size:12px;">号</span>
<span style="position:absolute; color:black; left:236px; top:125px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:188px; top:145px; font-size:12px;">标</span>
<span style="position:absolute; color:black; left:200px; top:145px; font-size:12px;">本</span>
<span style="position:absolute; color:black; left:212px; top:145px; font-size:12px;">状</span>
<span style="position:absolute; color:black; left:224px; top:145px; font-size:12px;">态</span>
<span style="position:absolute; color:black; left:236px; top:145px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:245px; top:144px; font-size:12px;">合</span>
<span style="position:absolute; color:black; left:257px; top:144px; font-size:12px;">格</span>
<span style="position:absolute; color:black; left:340px; top:107px; font-size:12px;">采</span>
<span style="position:absolute; color:black; left:352px; top:107px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:358px; top:107px; font-size:12px;">集</span>
<span style="position:absolute; color:black; left:370px; top:107px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:376px; top:107px; font-size:12px;">者</span>
<span style="position:absolute; color:black; left:388px; top:107px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:402px; top:107px; font-size:12px;">黄</span>
<span style="position:absolute; color:black; left:414px; top:107px; font-size:12px;">文</span>
<span style="position:absolute; color:black; left:426px; top:107px; font-size:12px;">夏</span>
<span style="position:absolute; color:black; left:438px; top:107px; font-size:12px;">/</span>
<span style="position:absolute; color:black; left:444px; top:107px; font-size:12px;">T</span>
<span style="position:absolute; color:black; left:450px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:456px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:462px; top:107px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:468px; top:107px; font-size:12px;">7</span>
<span style="position:absolute; color:black; left:340px; top:125px; font-size:12px;">科</span>
<span style="position:absolute; color:black; left:352px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:358px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:364px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:370px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:376px; top:125px; font-size:12px;">别</span>
<span style="position:absolute; color:black; left:388px; top:125px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:402px; top:126px; font-size:12px;">台</span>
<span style="position:absolute; color:black; left:414px; top:126px; font-size:12px;">胞</span>
<span style="position:absolute; color:black; left:426px; top:126px; font-size:12px;">生</span>
<span style="position:absolute; color:black; left:438px; top:126px; font-size:12px;">殖</span>
<span style="position:absolute; color:black; left:450px; top:126px; font-size:12px;">中</span>
<span style="position:absolute; color:black; left:462px; top:126px; font-size:12px;">心</span>
<span style="position:absolute; color:black; left:341px; top:144px; font-size:12px;">临</span>
<span style="position:absolute; color:black; left:353px; top:144px; font-size:12px;">床</span>
<span style="position:absolute; color:black; left:365px; top:144px; font-size:12px;">诊</span>
<span style="position:absolute; color:black; left:377px; top:144px; font-size:12px;">断</span>
<span style="position:absolute; color:black; left:389px; top:144px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:402px; top:144px; font-size:12px;">女</span>
<span style="position:absolute; color:black; left:414px; top:144px; font-size:12px;">性</span>
<span style="position:absolute; color:black; left:426px; top:144px; font-size:12px;">盆</span>
<span style="position:absolute; color:black; left:438px; top:144px; font-size:12px;">腔</span>
<span style="position:absolute; color:black; left:450px; top:144px; font-size:12px;">炎</span>
<span style="position:absolute; color:black; left:462px; top:144px; font-size:12px;">性</span>
<span style="position:absolute; color:black; left:474px; top:144px; font-size:12px;">疾</span>
<span style="position:absolute; color:black; left:486px; top:144px; font-size:12px;">病</span>
<span style="position:absolute; color:black; left:523px; top:107px; font-size:12px;">采</span>
<span style="position:absolute; color:black; left:535px; top:107px; font-size:12px;">集</span>
<span style="position:absolute; color:black; left:547px; top:107px; font-size:12px;">时</span>
<span style="position:absolute; color:black; left:559px; top:107px; font-size:12px;">间</span>
<span style="position:absolute; color:black; left:571px; top:107px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:585px; top:107px; font-size:12px;">2</span>
<span style="position:absolute; color:black; left:591px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:597px; top:107px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:603px; top:107px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:609px; top:107px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:615px; top:107px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:621px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:627px; top:107px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:633px; top:107px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:639px; top:107px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:645px; top:107px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:652px; top:107px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:658px; top:107px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:664px; top:107px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:670px; top:107px; font-size:12px;">4</span>
<span style="position:absolute; color:black; left:676px; top:107px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:523px; top:125px; font-size:12px;">接</span>
<span style="position:absolute; color:black; left:535px; top:125px; font-size:12px;">收</span>
<span style="position:absolute; color:black; left:547px; top:125px; font-size:12px;">时</span>
<span style="position:absolute; color:black; left:559px; top:125px; font-size:12px;">间</span>
<span style="position:absolute; color:black; left:571px; top:125px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:585px; top:125px; font-size:12px;">2</span>
<span style="position:absolute; color:black; left:591px; top:125px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:597px; top:125px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:603px; top:125px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:609px; top:125px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:615px; top:125px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:621px; top:125px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:627px; top:125px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:633px; top:125px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:639px; top:125px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:645px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:651px; top:125px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:657px; top:125px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:663px; top:125px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:670px; top:125px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:676px; top:125px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:57px; top:88px; font-size:12px;">姓</span>
<span style="position:absolute; color:black; left:69px; top:88px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:75px; top:88px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:81px; top:88px; font-size:12px;">名</span>
<span style="position:absolute; color:black; left:93px; top:88px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:108px; top:87px; font-size:16px;">陈</span>
<span style="position:absolute; color:black; left:124px; top:87px; font-size:16px;">超</span>
<span style="position:absolute; color:black; left:57px; top:107px; font-size:12px;">性</span>
<span style="position:absolute; color:black; left:69px; top:107px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:75px; top:107px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:81px; top:107px; font-size:12px;">别</span>
<span style="position:absolute; color:black; left:93px; top:107px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:108px; top:107px; font-size:12px;">女</span>
<span style="position:absolute; color:black; left:57px; top:125px; font-size:12px;">年</span>
<span style="position:absolute; color:black; left:69px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:75px; top:125px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:81px; top:125px; font-size:12px;">龄</span>
<span style="position:absolute; color:black; left:93px; top:125px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:108px; top:125px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:114px; top:125px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:120px; top:125px; font-size:12px;">岁</span>
<span style="position:absolute; color:black; left:57px; top:144px; font-size:12px;">标</span>
<span style="position:absolute; color:black; left:69px; top:144px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:75px; top:144px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:81px; top:144px; font-size:12px;">本</span>
<span style="position:absolute; color:black; left:93px; top:144px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:107px; top:144px; font-size:12px;">血</span>
<span style="position:absolute; color:black; left:119px; top:144px; font-size:12px;">清</span>
<span style="position:absolute; color:black; left:61px; top:167px; font-size:12px;">N</span>
<span style="position:absolute; color:black; left:67px; top:167px; font-size:12px;">o</span>
<span style="position:absolute; color:black; left:85px; top:167px; font-size:12px;">项</span>
<span style="position:absolute; color:black; left:98px; top:167px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:104px; top:167px; font-size:12px;">目</span>
<span style="position:absolute; color:black; left:57px; top:189px; font-size:13px;"> </span>
<span style="position:absolute; color:black; left:64px; top:189px; font-size:13px;">1</span>
<span style="position:absolute; color:black; left:83px; top:189px; font-size:13px;">梅</span>
<span style="position:absolute; color:black; left:96px; top:189px; font-size:13px;">毒</span>
<span style="position:absolute; color:black; left:109px; top:189px; font-size:13px;">螺</span>
<span style="position:absolute; color:black; left:122px; top:189px; font-size:13px;">旋</span>
<span style="position:absolute; color:black; left:135px; top:189px; font-size:13px;">体</span>
<span style="position:absolute; color:black; left:148px; top:189px; font-size:13px;">特</span>
<span style="position:absolute; color:black; left:161px; top:189px; font-size:13px;">异</span>
<span style="position:absolute; color:black; left:174px; top:189px; font-size:13px;">性</span>
<span style="position:absolute; color:black; left:187px; top:189px; font-size:13px;">抗</span>
<span style="position:absolute; color:black; left:200px; top:189px; font-size:13px;">体</span>
<span style="position:absolute; color:black; left:292px; top:167px; font-size:12px;">结</span>
<span style="position:absolute; color:black; left:304px; top:167px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:310px; top:167px; font-size:12px;">果</span>
<span style="position:absolute; color:black; left:291px; top:189px; font-size:13px;">0</span>
<span style="position:absolute; color:black; left:297px; top:189px; font-size:13px;">.</span>
<span style="position:absolute; color:black; left:304px; top:189px; font-size:13px;">0</span>
<span style="position:absolute; color:black; left:311px; top:189px; font-size:13px;">5</span>
<span style="position:absolute; color:black; left:318px; top:189px; font-size:13px;">(</span>
<span style="position:absolute; color:black; left:325px; top:189px; font-size:13px;">-</span>
<span style="position:absolute; color:black; left:332px; top:189px; font-size:13px;">)</span>
<span style="position:absolute; color:black; left:406px; top:167px; font-size:12px;">参</span>
<span style="position:absolute; color:black; left:418px; top:167px; font-size:12px;">考</span>
<span style="position:absolute; color:black; left:430px; top:167px; font-size:12px;">区</span>
<span style="position:absolute; color:black; left:442px; top:167px; font-size:12px;">间</span>
<span style="position:absolute; color:black; left:407px; top:187px; font-size:13px;">&lt;</span>
<span style="position:absolute; color:black; left:413px; top:187px; font-size:13px;">1</span>
<span style="position:absolute; color:black; left:420px; top:187px; font-size:13px;">.</span>
<span style="position:absolute; color:black; left:427px; top:187px; font-size:13px;">0</span>
<span style="position:absolute; color:black; left:434px; top:187px; font-size:13px;">0</span>
<span style="position:absolute; color:black; left:502px; top:167px; font-size:12px;">单</span>
<span style="position:absolute; color:black; left:514px; top:167px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:520px; top:167px; font-size:12px;">位</span>
<span style="position:absolute; color:black; left:555px; top:167px; font-size:12px;">仪</span>
<span style="position:absolute; color:black; left:567px; top:167px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:573px; top:167px; font-size:12px;">器</span>
<span style="position:absolute; color:black; left:624px; top:167px; font-size:12px;">方</span>
<span style="position:absolute; color:black; left:636px; top:167px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:642px; top:167px; font-size:12px;">法</span>
<span style="position:absolute; color:black; left:502px; top:189px; font-size:13px;">s</span>
<span style="position:absolute; color:black; left:509px; top:189px; font-size:13px;">/</span>
<span style="position:absolute; color:black; left:516px; top:189px; font-size:13px;">c</span>
<span style="position:absolute; color:black; left:523px; top:189px; font-size:13px;">o</span>
<span style="position:absolute; color:black; left:554px; top:188px; font-size:13px;">I</span>
<span style="position:absolute; color:black; left:561px; top:188px; font-size:13px;">2</span>
<span style="position:absolute; color:black; left:568px; top:188px; font-size:13px;">0</span>
<span style="position:absolute; color:black; left:575px; top:188px; font-size:13px;">0</span>
<span style="position:absolute; color:black; left:582px; top:188px; font-size:13px;">0</span>
<span style="position:absolute; color:black; left:589px; top:188px; font-size:13px;"> </span>
<span style="position:absolute; color:black; left:596px; top:188px; font-size:13px;"> </span>
<span style="position:absolute; color:black; left:602px; top:188px; font-size:13px;"> </span>
<span style="position:absolute; color:black; left:609px; top:188px; font-size:13px;">化</span>
<span style="position:absolute; color:black; left:622px; top:188px; font-size:13px;">学</span>
<span style="position:absolute; color:black; left:635px; top:188px; font-size:13px;">发</span>
<span style="position:absolute; color:black; left:648px; top:188px; font-size:13px;">光</span>
<span style="position:absolute; color:black; left:661px; top:188px; font-size:13px;">法</span>
<span style="position:absolute; color:black; left:60px; top:515px; font-size:12px;">备</span>
<span style="position:absolute; color:black; left:72px; top:515px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:78px; top:515px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:84px; top:515px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:90px; top:515px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:96px; top:515px; font-size:12px;">注</span>
<span style="position:absolute; color:black; left:108px; top:515px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:54px; top:534px; font-size:12px;">※</span>
<span style="position:absolute; color:black; left:66px; top:534px; font-size:12px;">结</span>
<span style="position:absolute; color:black; left:78px; top:534px; font-size:12px;">果</span>
<span style="position:absolute; color:black; left:90px; top:534px; font-size:12px;">仅</span>
<span style="position:absolute; color:black; left:102px; top:534px; font-size:12px;">对</span>
<span style="position:absolute; color:black; left:114px; top:534px; font-size:12px;">送</span>
<span style="position:absolute; color:black; left:126px; top:534px; font-size:12px;">检</span>
<span style="position:absolute; color:black; left:139px; top:534px; font-size:12px;">标</span>
<span style="position:absolute; color:black; left:151px; top:534px; font-size:12px;">本</span>
<span style="position:absolute; color:black; left:163px; top:534px; font-size:12px;">负</span>
<span style="position:absolute; color:black; left:175px; top:534px; font-size:12px;">责</span>
<span style="position:absolute; color:black; left:187px; top:534px; font-size:12px;">,</span>
<span style="position:absolute; color:black; left:199px; top:534px; font-size:12px;">有</span>
<span style="position:absolute; color:black; left:211px; top:534px; font-size:12px;">疑</span>
<span style="position:absolute; color:black; left:223px; top:534px; font-size:12px;">问</span>
<span style="position:absolute; color:black; left:235px; top:534px; font-size:12px;">请</span>
<span style="position:absolute; color:black; left:247px; top:534px; font-size:12px;">于</span>
<span style="position:absolute; color:black; left:260px; top:534px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:266px; top:534px; font-size:12px;">日</span>
<span style="position:absolute; color:black; left:278px; top:534px; font-size:12px;">内</span>
<span style="position:absolute; color:black; left:290px; top:534px; font-size:12px;">咨</span>
<span style="position:absolute; color:black; left:302px; top:534px; font-size:12px;">询</span>
<span style="position:absolute; color:black; left:54px; top:548px; font-size:12px;">※</span>
<span style="position:absolute; color:black; left:66px; top:548px; font-size:12px;">带</span>
<span style="position:absolute; color:black; left:78px; top:548px; font-size:12px;">"</span>
<span style="position:absolute; color:black; left:84px; top:548px; font-size:12px;">*</span>
<span style="position:absolute; color:black; left:90px; top:548px; font-size:12px;">"</span>
<span style="position:absolute; color:black; left:96px; top:548px; font-size:12px;">结</span>
<span style="position:absolute; color:black; left:108px; top:548px; font-size:12px;">果</span>
<span style="position:absolute; color:black; left:120px; top:548px; font-size:12px;">按</span>
<span style="position:absolute; color:black; left:132px; top:548px; font-size:12px;">卫</span>
<span style="position:absolute; color:black; left:144px; top:548px; font-size:12px;">生</span>
<span style="position:absolute; color:black; left:157px; top:548px; font-size:12px;">厅</span>
<span style="position:absolute; color:black; left:169px; top:548px; font-size:12px;">规</span>
<span style="position:absolute; color:black; left:181px; top:548px; font-size:12px;">定</span>
<span style="position:absolute; color:black; left:193px; top:548px; font-size:12px;">参</span>
<span style="position:absolute; color:black; left:205px; top:548px; font-size:12px;">加</span>
<span style="position:absolute; color:black; left:217px; top:548px; font-size:12px;">互</span>
<span style="position:absolute; color:black; left:229px; top:548px; font-size:12px;">认</span>
<span style="position:absolute; color:black; left:340px; top:535px; font-size:12px;">报</span>
<span style="position:absolute; color:black; left:352px; top:535px; font-size:12px;">告</span>
<span style="position:absolute; color:black; left:364px; top:535px; font-size:12px;">时</span>
<span style="position:absolute; color:black; left:376px; top:535px; font-size:12px;">间</span>
<span style="position:absolute; color:black; left:388px; top:535px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:395px; top:534px; font-size:12px;">2</span>
<span style="position:absolute; color:black; left:401px; top:534px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:407px; top:534px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:413px; top:534px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:419px; top:534px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:425px; top:534px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:431px; top:534px; font-size:12px;">0</span>
<span style="position:absolute; color:black; left:437px; top:534px; font-size:12px;">.</span>
<span style="position:absolute; color:black; left:443px; top:534px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:449px; top:534px; font-size:12px;">3</span>
<span style="position:absolute; color:black; left:455px; top:534px; font-size:12px;"> </span>
<span style="position:absolute; color:black; left:461px; top:534px; font-size:12px;">1</span>
<span style="position:absolute; color:black; left:468px; top:534px; font-size:12px;">4</span>
<span style="position:absolute; color:black; left:474px; top:534px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:480px; top:534px; font-size:12px;">5</span>
<span style="position:absolute; color:black; left:486px; top:534px; font-size:12px;">6</span>
<span style="position:absolute; color:black; left:506px; top:534px; font-size:12px;">检</span>
<span style="position:absolute; color:black; left:518px; top:534px; font-size:12px;">验</span>
<span style="position:absolute; color:black; left:530px; top:534px; font-size:12px;">者</span>
<span style="position:absolute; color:black; left:542px; top:534px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:549px; top:534px; font-size:12px;">陈</span>
<span style="position:absolute; color:black; left:561px; top:534px; font-size:12px;">静</span>
<span style="position:absolute; color:black; left:612px; top:534px; font-size:12px;">核</span>
<span style="position:absolute; color:black; left:624px; top:534px; font-size:12px;">对</span>
<span style="position:absolute; color:black; left:636px; top:534px; font-size:12px;">者</span>
<span style="position:absolute; color:black; left:648px; top:534px; font-size:12px;">:</span>
<span style="position:absolute; color:black; left:656px; top:534px; font-size:12px;">林</span>
<span style="position:absolute; color:black; left:668px; top:534px; font-size:12px;">永</span>
<span style="position:absolute; color:black; left:680px; top:534px; font-size:12px;">梅</span>
<span style="position:absolute; border: black 1px solid; left:56px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:58px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:60px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:61px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:62px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:64px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:67px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:69px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:70px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:73px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:74px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:76px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:77px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:80px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:82px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:84px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:86px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:88px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:89px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:92px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:93px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:95px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:97px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:99px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:102px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:104px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:105px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:107px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:108px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:111px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:112px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:114px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:116px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:117px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:120px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:121px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:124px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:126px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:127px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:129px; top:62px; width:0px; height:20px;"></span>
<span style="position:absolute; border: black 1px solid; left:57px; top:162px; width:631px; height:0px;"></span>
<span style="position:absolute; border: black 1px solid; left:291px; top:187px; width:80px; height:18px;"></span>
<span style="position:absolute; border: black 1px solid; left:55px; top:180px; width:633px; height:0px;"></span>
<span style="position:absolute; border: black 1px solid; left:54px; top:531px; width:659px; height:0px;"></span>
<span style="position:absolute; border: black 1px solid; left:54px; top:533px; width:278px; height:13px;"></span>
<span style="position:absolute; border: black 1px solid; left:54px; top:547px; width:255px; height:13px;"></span>
<div style="position:absolute; border: figure 1px solid; writing-mode:False; left:655px; top:524px; width:65px; height:28px;">
</div>
<div style="position:absolute; top:0px;">Page: <a href="#1">1</a></div>

</body></html>
wanghaisheng commented 7 years ago

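Question 1

1. How can the latest version of pdf.js be ported into the pdf2json library?
2. How should coordinates be converted? Going by the source code, the pixel values would be:

x/px = PDFUnit.toPixelX(x + 0.25)
y/px = PDFUnit.toPixelY(y + 0.75)

w/px = PDFUnit.toFixedFloat(maxWidth)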

The unit for all width, height, length, etc, is in "Form Unit". If you need pixels value, you can use the converter below: https://github.com/modesty/pdf2json/issues/4

https://github.com/modesty/pdf2json/blob/3fe724db05659ad12c2c0f1b019530c906ad23de/lib/pdffont.js https://github.com/modesty/pdf2json/blob/3fe724db05659ad12c2c0f1b019530c906ad23de/lib/pdfunit.js


    let dpi = 96.0;
    let gridXPerInch = 4.0;
    let gridYPerInch = 4.0;

    let _pixelXPerGrid = dpi/gridXPerInch;
    let _pixelYPerGrid = dpi/gridYPerInch;
    let _pixelPerPoint = dpi/72;

    // Excerpt from lib/pdffont.js: how one text block is emitted.
    // TS = [font face index, font size, bold flag, italic flag]
    let TS = [this.faceIdx, this.fontSize, this.bold?1:0, this.italic?1:0];
    let clrId = PDFUnit.findColorIndex(color);

    // x and y are in form units, shifted by -0.25 and -0.75
    let oneText = {x: PDFUnit.toFormX(p.x) - 0.25,
        y: PDFUnit.toFormY(p.y) - 0.75,
        w: PDFUnit.toFixedFloat(maxWidth),
        sw: this.spaceWidth, // font space width, used to merge adjacent text blocks
        clr: clrId,
        A: "left",
        R: [{
            T: this.flash_encode(text),
            S: this.fontStyleId,
            TS: TS
        }]
    };
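
The conversion asked about in point 2 can be sketched from the constants and the `oneText` excerpt above. This is a minimal sketch, not pdf2json's own code: `formToPixelX`, `formToPixelY`, and `textOriginToPixels` are hypothetical helper names, and the sketch assumes that form units map to pixels by a plain scale of `dpi / gridXPerInch` (24 px per form unit) and that the `- 0.25` / `- 0.75` shifts should be added back first.

```javascript
// Constants as quoted in the pdfunit.js excerpt.
const dpi = 96.0;
const gridXPerInch = 4.0;
const gridYPerInch = 4.0;
const pixelXPerGrid = dpi / gridXPerInch; // 24 px per form unit
const pixelYPerGrid = dpi / gridYPerInch; // 24 px per form unit

// Hypothetical helpers (not pdf2json API): assume the mapping is a plain scale.
function formToPixelX(formX) {
  return Math.round(formX * pixelXPerGrid);
}

function formToPixelY(formY) {
  return Math.round(formY * pixelYPerGrid);
}

// Text origins are stored with 0.25 / 0.75 subtracted (see oneText),
// so add the offsets back before scaling.
function textOriginToPixels(text) {
  return {
    left: formToPixelX(text.x + 0.25),
    top: formToPixelY(text.y + 0.75),
  };
}

// x/y of the first entry in "Texts" from the pdf2json output in this thread.
const titleOrigin = textOriginToPixels({ x: 12.006, y: 1.038 });
console.log(titleOrigin); // { left: 294, top: 43 }

// Page "Height" from the same output, in pixels at 96 dpi.
console.log(formToPixelY(34.96)); // 839
```

These pixel values are at 96 dpi, so they will not line up directly with coordinates from tools that report positions in PDF points (1/72 inch).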

3. How can we borrow algorithms from other libraries built on pdf2json?

https://github.com/SamDecrock/pdf2table
https://github.com/TennisVisuals/pdfRuler
https://github.com/barkbarkuk/barclaysStatement2json
https://github.com/flexpaper/pdf2json (a library with exactly the same name, but written in C++)
https://github.com/matjaz/kam-jest
https://github.com/medicaremojo/document-parser (parses PDF documents related to medical insurance)
https://github.com/garysieling/pdf-js-csv (the author's blog has a series of posts on parsing PDFs with pdf.js, covering both text and tables)
https://github.com/isaacmast/linkedin-pdf-to-json
https://github.com/giang-pham/clinic-list
https://github.com/bhaskar20/pdf-parse
https://github.com/AndyLc/doc-formatter
https://github.com/EGWeeks/AMAPDFtoJSONParser
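
As a starting point for point 3, one simple idea for turning pdf2json `Texts` into table rows is to cluster entries by their `y` value and order each cluster by `x`. The sketch below is not taken from any of the repositories listed above; `groupTextsIntoRows` and the `yTolerance` default are hypothetical, and the tolerance would need tuning per document.

```javascript
// Hypothetical helper: cluster pdf2json "Texts" entries into rows by y,
// then order the cells of each row by x.
function groupTextsIntoRows(texts, yTolerance = 0.3) {
  const items = texts
    .map((t) => ({
      x: t.x,
      y: t.y,
      // Raw pdf2json output stores run text URI-encoded in R[].T.
      text: t.R.map((run) => decodeURIComponent(run.T)).join(""),
    }))
    .sort((a, b) => a.y - b.y || a.x - b.x);

  const rows = [];
  for (const item of items) {
    const last = rows[rows.length - 1];
    // An item joins the current row if it is close to that row's first y.
    if (last && Math.abs(item.y - last.y) <= yTolerance) {
      last.cells.push(item);
    } else {
      rows.push({ y: item.y, cells: [item] });
    }
  }
  return rows.map((row) =>
    row.cells.sort((a, b) => a.x - b.x).map((cell) => cell.text)
  );
}

// Coordinates taken from the pdf2json output in this thread; T is re-encoded
// here to mimic the raw (not yet URL-decoded) output.
const sampleTexts = [
  { x: 12.006, y: 1.038, R: [{ T: encodeURIComponent("福建医科大学附属第一医院检验报告单") }] },
  { x: 42.441, y: 0.309, R: [{ T: "10.13" }] },
  { x: 39.435, y: 0.309, R: [{ T: encodeURIComponent("【免疫】") }] },
];
console.log(groupTextsIntoRows(sampleTexts));
```

Note that `decodeURIComponent` throws a `URIError` on malformed percent sequences, so text that has already been decoded and contains a literal `%` should not be passed through this helper again.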

pdf2json output after URL-decoding:

{
    "formImage": {
        "Transcoder": "pdf2json@1.1.6 [https://github.com/modesty/pdf2json]",
        "Agency": "",
        "Id": {
            "AgencyId": "",
            "Name": "",
            "MC": false,
            "Max": 1,
            "Parent": ""
        },
        "Pages": [
            {
                "Height": 34.96,
                "HLines": [
                    {
                        "x": 3.596,
                        "y": 7.062,
                        "w": 2.877,
                        "l": 39.455
                    },
                    {
                        "x": 3.466,
                        "y": 8.181,
                        "w": 1.438,
                        "l": 39.585
                    },
                    {
                        "x": 3.406,
                        "y": 30.095,
                        "w": 2.877,
                        "l": 41.203
                    }
                ],
                "VLines": [
                    {
                        "x": 3.526,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.646,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.766,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.826,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.936,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.055,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.235,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.355,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.415,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.585,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.645,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.765,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.824,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.004,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.174,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.294,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.414,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.534,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.593,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.763,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.823,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.943,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.063,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.243,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.413,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.532,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.593,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.712,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.772,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.952,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.062,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.182,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.302,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.362,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.541,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.591,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.771,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.891,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.951,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 8.071,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    }
                ],
                "Fills": [
                    {
                        "x": 0,
                        "y": 0,
                        "w": 0,
                        "h": 0,
                        "clr": 1
                    },
                    {
                        "x": 18.199,
                        "y": 8.57,
                        "w": 5.054,
                        "h": 1.178,
                        "clr": 1
                    },
                    {
                        "x": 3.406,
                        "y": 30.215,
                        "w": 17.42,
                        "h": 0.869,
                        "clr": 1
                    },
                    {
                        "x": 3.406,
                        "y": 31.095,
                        "w": 15.982,
                        "h": 0.869,
                        "clr": 1
                    }
                ],
                "Texts": [
                    {
                        "x": 12.006,
                        "y": 1.038,
                        "w": 17,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "福建医科大学附属第一医院检验报告单",
                                "S": -1,
                                "TS": [
                                    3,
                                    24.275657,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 39.435,
                        "y": 0.30899999999999994,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "【免疫】",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 42.441,
                        "y": 0.30899999999999994,
                        "w": 2.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "10.13",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 41.422,
                        "y": 1.148,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "NO",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 42.201,
                        "y": 1.148,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": ".",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 42.959,
                        "y": 1.148,
                        "w": 2,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "1269",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 2.326,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "姓",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 2.326,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "名:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.532,
                        "y": 2.456,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "陈超",
                                "S": -1,
                                "TS": [
                                    3,
                                    18.996696999999998,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.527,
                        "y": 2.326,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "门",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 12.659,
                        "y": 2.326,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "诊",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 13.784,
                        "y": 2.326,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "号:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.082,
                        "y": 2.326,
                        "w": 5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "3037089460",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 20.961,
                        "y": 2.326,
                        "w": 8,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "申请医生:郭玉佳",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 26.96,
                        "y": 2.326,
                        "w": 2.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "/2985",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.254,
                        "y": 2.326,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "申请时间:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 36.124,
                        "y": 2.326,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13 09:59",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "性",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 3.5149999999999997,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "别:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.531,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "女",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.525,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "条形码号:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.087,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "1436620800",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 20.958,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "采",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 22.083,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "集",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 23.216,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "者:黄文夏",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 26.965,
                        "y": 3.5149999999999997,
                        "w": 3,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "/T0037",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.259,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "采集时间:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 36.129,
                        "y": 3.5149999999999997,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13 10:45",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "年",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 4.634,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "龄:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.531,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "33",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 7.288,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "岁",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.525,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "床",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 13.775,
                        "y": 4.634,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "号:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 20.996,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "科",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 23.253,
                        "y": 4.634,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "别:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 24.881,
                        "y": 4.694,
                        "w": 6,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "台胞生殖中心",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.443,
                        "y": 4.634,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "接收时间:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 36.313,
                        "y": 4.634,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13 10:51",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 5.822,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "标",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 5.822,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "本:血清",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.527,
                        "y": 5.852,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "标本状态:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.082,
                        "y": 5.822,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "合格",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 21.074,
                        "y": 5.822,
                        "w": 13,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "临床诊断:女性盆腔炎性疾病",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 8.669,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": " 1",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.964,
                        "y": 8.699,
                        "w": 10,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "梅毒螺旋体特异性抗体",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 25.191,
                        "y": 8.529,
                        "w": 2.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "<1.00",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 31.184,
                        "y": 8.699,
                        "w": 2,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "s/co",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 34.43,
                        "y": 8.639,
                        "w": 4,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "I2000   ",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 37.758,
                        "y": 8.639,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "化学发光法",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 17.949,
                        "y": 8.699,
                        "w": 3.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "0.05(-)",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.156,
                        "y": 30.165,
                        "w": 17,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "※结果仅对送检标本负责,有疑问请于",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.911000000000001,
                        "y": 30.165,
                        "w": 0.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "3",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 16.286,
                        "y": 30.165,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "日内咨询",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.156,
                        "y": 31.044,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "※带",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.663,
                        "y": 31.044,
                        "w": 1.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "\"*\"",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 5.788,
                        "y": 31.044,
                        "w": 12,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "结果按卫生厅规定参加互认",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 21.016,
                        "y": 30.235,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "报告时间:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 24.442,
                        "y": 30.205,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13 14:56",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 31.303,
                        "y": 30.205,
                        "w": 6,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "检验者:陈静",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 38.299,
                        "y": 30.205,
                        "w": 7,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "核对者:林永梅",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.576,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "No",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 5.12,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "项",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.253,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "目",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 18.033,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "结",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 19.165,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "果",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 25.126,
                        "y": 7.260999999999999,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "参考区间",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 31.178,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "单",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.31,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "位",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 34.44,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "仪",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 35.572,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "器",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 38.789,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "方",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 39.921,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "法",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.506,
                        "y": 28.986,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "备",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 5.763,
                        "y": 28.986,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "注:",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    }
                ],
                "Fields": [],
                "Boxsets": []
            }
        ],
        "Width": 49.61
    }
}
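
The dump above follows the pdf2json page layout: each entry in `Texts` carries a position (`x`, `y`) and one or more runs in `R`. In the raw output that follows, each run's `T` value is URI-encoded. Below is a minimal sketch, assuming that layout, of decoding the runs and grouping them into visual rows by `y`. The function names and the `y_tol` tolerance are illustrative assumptions, not part of pdf2json.

```python
import json
from urllib.parse import unquote


def decode_texts(page):
    """Return (x, y, text) tuples for one page, URI-decoding every text run."""
    items = []
    for block in page.get("Texts", []):
        text = "".join(unquote(run.get("T", "")) for run in block.get("R", []))
        items.append((block["x"], block["y"], text))
    return items


def group_rows(items, y_tol=0.2):
    """Group items into rows; y_tol is a guessed tolerance in page units.

    An item joins the current row when its y is within y_tol of the
    first item that opened that row; otherwise it starts a new row.
    """
    rows = []
    for x, y, text in sorted(items, key=lambda it: (it[1], it[0])):
        if rows and abs(y - rows[-1]["y"]) <= y_tol:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})
    # Within each row, order cells left to right and keep only the text.
    return [[t for _, t in sorted(row["cells"])] for row in rows]


# Tiny sample shaped like the dumps in this thread (values copied from them).
raw = """
{"formImage": {"Pages": [{"Texts": [
  {"x": 42.441, "y": 0.309, "R": [{"T": "10.13"}]},
  {"x": 39.435, "y": 0.309, "R": [{"T": "%E3%80%90%E5%85%8D%E7%96%AB%E3%80%91"}]},
  {"x": 41.422, "y": 1.148, "R": [{"T": "NO"}]}
]}]}}
"""
doc = json.loads(raw)
for page in doc["formImage"]["Pages"]:
    for row in group_rows(decode_texts(page)):
        print(row)
```

Because the header and value cells of the report sit at slightly different `y` values (for example 8.529 to 8.699 in the result row), an exact-match grouping would split one visual row into several; the tolerance is there to absorb that jitter and would likely need tuning per document.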
{
    "formImage": {
        "Transcoder": "pdf2json@1.1.6 [https://github.com/modesty/pdf2json]",
        "Agency": "",
        "Id": {
            "AgencyId": "",
            "Name": "",
            "MC": false,
            "Max": 1,
            "Parent": ""
        },
        "Pages": [
            {
                "Height": 34.96,
                "HLines": [
                    {
                        "x": 3.596,
                        "y": 7.062,
                        "w": 2.877,
                        "l": 39.455
                    },
                    {
                        "x": 3.466,
                        "y": 8.181,
                        "w": 1.438,
                        "l": 39.585
                    },
                    {
                        "x": 3.406,
                        "y": 30.095,
                        "w": 2.877,
                        "l": 41.203
                    }
                ],
                "VLines": [
                    {
                        "x": 3.526,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.646,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.766,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.826,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 3.936,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.055,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.235,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.355,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.415,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.585,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.645,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.765,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 4.824,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.004,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.174,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.294,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.414,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.534,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.593,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.763,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.823,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 5.943,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.063,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.243,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.413,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.532,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.593,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.712,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.772,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 6.952,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.062,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.182,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.302,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.362,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.541,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.591,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.771,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.891,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 7.951,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    },
                    {
                        "x": 8.071,
                        "y": 0.759,
                        "w": 1.438,
                        "l": 1.309
                    }
                ],
                "Fills": [
                    {
                        "x": 0,
                        "y": 0,
                        "w": 0,
                        "h": 0,
                        "clr": 1
                    },
                    {
                        "x": 18.199,
                        "y": 8.57,
                        "w": 5.054,
                        "h": 1.178,
                        "clr": 1
                    },
                    {
                        "x": 3.406,
                        "y": 30.215,
                        "w": 17.42,
                        "h": 0.869,
                        "clr": 1
                    },
                    {
                        "x": 3.406,
                        "y": 31.095,
                        "w": 15.982,
                        "h": 0.869,
                        "clr": 1
                    }
                ],
                "Texts": [
                    {
                        "x": 12.006,
                        "y": 1.038,
                        "w": 17,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E7%A6%8F%E5%BB%BA%E5%8C%BB%E7%A7%91%E5%A4%A7%E5%AD%A6%E9%99%84%E5%B1%9E%E7%AC%AC%E4%B8%80%E5%8C%BB%E9%99%A2%E6%A3%80%E9%AA%8C%E6%8A%A5%E5%91%8A%E5%8D%95",
                                "S": -1,
                                "TS": [
                                    3,
                                    24.275657,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 39.435,
                        "y": 0.30899999999999994,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E3%80%90%E5%85%8D%E7%96%AB%E3%80%91",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 42.441,
                        "y": 0.30899999999999994,
                        "w": 2.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "10.13",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 41.422,
                        "y": 1.148,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "NO",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 42.201,
                        "y": 1.148,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%EF%BC%8E",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 42.959,
                        "y": 1.148,
                        "w": 2,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "1269",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 2.326,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%A7%93",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 2.326,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%90%8D%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.532,
                        "y": 2.456,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E9%99%88%E8%B6%85",
                                "S": -1,
                                "TS": [
                                    3,
                                    18.996696999999998,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.527,
                        "y": 2.326,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E9%97%A8",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 12.659,
                        "y": 2.326,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E8%AF%8A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 13.784,
                        "y": 2.326,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%8F%B7%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.082,
                        "y": 2.326,
                        "w": 5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "3037089460",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 20.961,
                        "y": 2.326,
                        "w": 8,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E7%94%B3%E8%AF%B7%E5%8C%BB%E7%94%9F%EF%BC%9A%E9%83%AD%E7%8E%89%E4%BD%B3",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 26.96,
                        "y": 2.326,
                        "w": 2.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%2F2985",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.254,
                        "y": 2.326,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E7%94%B3%E8%AF%B7%E6%97%B6%E9%97%B4%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 36.124,
                        "y": 2.326,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13%2009%3A59",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%80%A7",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 3.5149999999999997,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%88%AB%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.531,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%A5%B3",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.525,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%9D%A1%E5%BD%A2%E7%A0%81%E5%8F%B7%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.087,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "1436620800",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 20.958,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E9%87%87",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 22.083,
                        "y": 3.5149999999999997,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E9%9B%86",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 23.216,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E8%80%85%EF%BC%9A%E9%BB%84%E6%96%87%E5%A4%8F",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 26.965,
                        "y": 3.5149999999999997,
                        "w": 3,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%2FT0037",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.259,
                        "y": 3.5149999999999997,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E9%87%87%E9%9B%86%E6%97%B6%E9%97%B4%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 36.129,
                        "y": 3.5149999999999997,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13%2010%3A45",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%B9%B4",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 4.634,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E9%BE%84%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.531,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "33",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 7.288,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%B2%81",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.525,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%BA%8A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 13.775,
                        "y": 4.634,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%8F%B7%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 20.996,
                        "y": 4.634,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E7%A7%91",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 23.253,
                        "y": 4.634,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%88%AB%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 24.881,
                        "y": 4.694,
                        "w": 6,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%8F%B0%E8%83%9E%E7%94%9F%E6%AE%96%E4%B8%AD%E5%BF%83",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.443,
                        "y": 4.634,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%8E%A5%E6%94%B6%E6%97%B6%E9%97%B4%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 36.313,
                        "y": 4.634,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13%2010%3A51",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 5.822,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%A0%87",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.836,
                        "y": 5.822,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%9C%AC%EF%BC%9A%E8%A1%80%E6%B8%85",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 11.527,
                        "y": 5.852,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%A0%87%E6%9C%AC%E7%8A%B6%E6%80%81%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.082,
                        "y": 5.822,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%90%88%E6%A0%BC",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 21.074,
                        "y": 5.822,
                        "w": 13,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E4%B8%B4%E5%BA%8A%E8%AF%8A%E6%96%AD%EF%BC%9A%E5%A5%B3%E6%80%A7%E7%9B%86%E8%85%94%E7%82%8E%E6%80%A7%E7%96%BE%E7%97%85",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.336,
                        "y": 8.669,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%201",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.964,
                        "y": 8.699,
                        "w": 10,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%A2%85%E6%AF%92%E8%9E%BA%E6%97%8B%E4%BD%93%E7%89%B9%E5%BC%82%E6%80%A7%E6%8A%97%E4%BD%93",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 25.191,
                        "y": 8.529,
                        "w": 2.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%3C1.00",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 31.184,
                        "y": 8.699,
                        "w": 2,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "s%2Fco",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 34.43,
                        "y": 8.639,
                        "w": 4,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "I2000%20%20%20",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 37.758,
                        "y": 8.639,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%8C%96%E5%AD%A6%E5%8F%91%E5%85%89%E6%B3%95",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 17.949,
                        "y": 8.699,
                        "w": 3.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "0.05(-)",
                                "S": -1,
                                "TS": [
                                    3,
                                    16.277309000000002,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.156,
                        "y": 30.165,
                        "w": 17,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E2%80%BB%E7%BB%93%E6%9E%9C%E4%BB%85%E5%AF%B9%E9%80%81%E6%A3%80%E6%A0%87%E6%9C%AC%E8%B4%9F%E8%B4%A3%EF%BC%8C%E6%9C%89%E7%96%91%E9%97%AE%E8%AF%B7%E4%BA%8E",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 15.911000000000001,
                        "y": 30.165,
                        "w": 0.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "3",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 16.286,
                        "y": 30.165,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%97%A5%E5%86%85%E5%92%A8%E8%AF%A2",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.156,
                        "y": 31.044,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E2%80%BB%E5%B8%A6",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 4.663,
                        "y": 31.044,
                        "w": 1.5,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%22*%22",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 5.788,
                        "y": 31.044,
                        "w": 12,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E7%BB%93%E6%9E%9C%E6%8C%89%E5%8D%AB%E7%94%9F%E5%8E%85%E8%A7%84%E5%AE%9A%E5%8F%82%E5%8A%A0%E4%BA%92%E8%AE%A4",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 21.016,
                        "y": 30.235,
                        "w": 5,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%8A%A5%E5%91%8A%E6%97%B6%E9%97%B4%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 24.442,
                        "y": 30.205,
                        "w": 8,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "2015.10.13%2014%3A56",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 31.303,
                        "y": 30.205,
                        "w": 6,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%A3%80%E9%AA%8C%E8%80%85%EF%BC%9A%E9%99%88%E9%9D%99",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 38.299,
                        "y": 30.205,
                        "w": 7,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%A0%B8%E5%AF%B9%E8%80%85%EF%BC%9A%E6%9E%97%E6%B0%B8%E6%A2%85",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.576,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.65103125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "No",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 5.12,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E9%A1%B9",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 6.253,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E7%9B%AE",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 18.033,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E7%BB%93",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 19.165,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%9E%9C",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 25.126,
                        "y": 7.260999999999999,
                        "w": 4,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%8F%82%E8%80%83%E5%8C%BA%E9%97%B4",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 31.178,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%8D%95",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 32.31,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E4%BD%8D",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 34.44,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E4%BB%AA",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 35.572,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%99%A8",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 38.789,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%96%B9",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 39.921,
                        "y": 7.260999999999999,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%B3%95",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 3.506,
                        "y": 28.986,
                        "w": 1,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E5%A4%87",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    },
                    {
                        "x": 5.763,
                        "y": 28.986,
                        "w": 2,
                        "sw": 0.32553125,
                        "clr": 0,
                        "A": "left",
                        "R": [
                            {
                                "T": "%E6%B3%A8%EF%BC%9A",
                                "S": -1,
                                "TS": [
                                    3,
                                    14.997573,
                                    0,
                                    0
                                ]
                            }
                        ]
                    }
                ],
                "Fields": [],
                "Boxsets": []
            }
        ],
        "Width": 49.61
    }
}
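The "T" values in the JSON above are percent-encoded UTF-8, and each text block carries only a page position ("x", "y"), so the output has to be decoded and regrouped before it reads as a table. The following is a minimal sketch of that step, not part of the converter itself. It assumes each block is a dict with "x", "y" and an "R" list of runs as shown above; the helper names `decode_runs` and `group_rows` and the row tolerance of 0.3 are illustrative choices, not anything defined by the tool.

```python
from urllib.parse import unquote


def decode_runs(text_block):
    """Join the percent-decoded "T" strings of every run in one text block."""
    return "".join(unquote(run["T"]) for run in text_block["R"])


def group_rows(texts, tol=0.3):
    """Cluster text blocks into rows by their y position, then order by x.

    `tol` is an assumed tolerance: blocks whose y lies within `tol` of the
    first block of a row are treated as being on that row.
    """
    items = sorted((t["y"], t["x"], decode_runs(t)) for t in texts)
    rows = []
    row_y = None
    for y, x, text in items:
        if row_y is None or y - row_y > tol:
            rows.append([])   # start a new row
            row_y = y
        rows[-1].append((x, text))
    return [[text for _, text in sorted(row)] for row in rows]


# A few blocks copied from the output above (only the keys used here).
sample = [
    {"x": 25.191, "y": 8.529, "R": [{"T": "%3C1.00"}]},
    {"x": 31.184, "y": 8.699, "R": [{"T": "s%2Fco"}]},
    {"x": 17.949, "y": 8.699, "R": [{"T": "0.05(-)"}]},
    {"x": 3.576, "y": 7.261, "R": [{"T": "No"}]},
]

for row in group_rows(sample):
    print(row)
```

The tolerance matters because blocks on the same visual line differ slightly in "y" (8.529 to 8.699 in the result row above), so exact matching on "y" would split one table row into several.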
wanghaisheng commented 7 years ago
# Written for Python 3 (e.g. with the pdfminer.six package).
import io

from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage


def convert_pdf_to_txt(path):
    """Extract the text of every page of the PDF at `path` and return it."""
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    interpreter = PDFPageInterpreter(rsrcmgr, device)

    password = ""
    maxpages = 0       # 0 means no page limit
    caching = True
    pagenos = set()    # an empty set means all pages

    # PDFPage.get_pages() creates its own parser and document from the file object.
    with open(path, 'rb') as fp:
        for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages,
                                      password=password, caching=caching,
                                      check_extractable=True):
            interpreter.process_page(page)

    text = retstr.getvalue()

    device.close()
    retstr.close()
    print(text)
    return text
wanghaisheng commented 7 years ago

http://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20?rq=1 https://github.com/Micka33/content-extractor

wanghaisheng commented 7 years ago

Not an issue, just a note that compiling pdfminer with Cython is possible, and that informal testing suggests it gives roughly a 33-40% speedup in PDF parsing and processing.

I was following ideas from http://stackoverflow.com/questions/11507101/how-to-compile-and-link-multiple-python-modules-or-packages-using-cython

Steps
1. Rename the .py files inside the main pdfminer module directory, apart from __init__.py, to .pyx (NB: this may not be necessary; Cython may work with .py files, but I haven't tested this)
2. Create a setup.py inside the main pdfminer module directory, i.e.

from distutils.core import setup
from Cython.Build import cythonize

setup(
    name='pdfminer',
    ext_modules=cythonize("*.pyx"),
)

3. Compile with python setup.py build_ext --inplace
4. This will create an additional pdfminer directory containing compiled .so files
5. Move this pdfminer directory with the .so files into your project structure and import as normal
wanghaisheng commented 7 years ago

ocrmypdf

Test with image input

➜  test-data git:(master) ✗ docker run -v "$(pwd):/home/docker"  ocrmypdf --output-type pdf --pdf-renderer tesseract --force-ocr -l  chi_sim 1.png    1-pic-redo.pdf
   INFO - Input file is not a PDF, checking if it is an image...
   INFO - Input file is an image
   INFO - Image seems valid. Try converting to PDF...
   INFO - Successfully converted to PDF, processing...
➜  test-data git:(master) ✗ docker run -v "$(pwd):/home/docker"  ocrmypdf --output-type pdf --pdf-renderer tesseract --force-ocr -l  chi_sim 1-pic-redo.pdf 1-pic-redo11.pdf
   INFO -    1: page already has text! – rasterizing text and running OCR anyway

Switch to the pdfminer Docker image
root@30c7a47055f5:/tmp/test-data# pdf2txt.py -o 11_redo.html -Y exact 11_redo.pdf 
Opening the HTML shows that only the handwritten text (the verifier's name) was recognized incorrectly.

Test with a PDF input whose text encoding is broken

➜  test-data git:(master) ✗ docker run -v "$(pwd):/home/docker"  ocrmypdf --output-type pdf --pdf-renderer tesseract --force-ocr -l chi_sim 11.pdf 11_redo.pdf
WARNING -    1: page has no images - all vector content will be rasterized at 400 DPI, losing some resolution and likely increasing file size. Use --oversample to adjust the DPI.

11.PDF.zip

wanghaisheng commented 7 years ago

1. xps <--> pdf <--> html
2. Use a web-scraping tool to extract the key information

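Among web-scraping tools, what are the pros and cons of the international ones (http://import.io, Data Scraping Studio, Scrapinghub) and the Chinese ones (集搜客 GooSeeker, 八爪鱼 Octoparse, 火车头 LocoySpider)? http://portia.readthedocs.io/en/latest/examples.html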

docker build -t portia .

wanghaisheng commented 7 years ago

A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. https://datascience.blog.wzb.eu/2017/

Source code: https://github.com/WZBSocialScienceCenter/pdftabextract
Documentation: https://datascience.blog.wzb.eu/2017/02/16/data-mining-ocr-pdfs-using-pdftabextract-to-liberate-tabular-data-from-scanned-documents/

Test on 2017-07-05

wanghaisheng commented 6 years ago

https://stackoverflow.com/questions/2926159/copypasting-text-from-pdf-results-in-garbage

Very often in such cases, where you can't select, copy'n'paste text from the Acrobat (Reader) window, there is another option which may work nevertheless:

Open 'File' menu, select 'Save as...', select 'Text (normal) (*.txt)', browse to the target directory, type the name you want to use for the text file.

You'll have all text from all pages in the file and need to locate the spot you wanted to copy'n'paste initially -- insofar it is not as comfortable as direct copy'n'paste. But it works more reliably....

It also works with acroread on Linux (but you have to choose 'Save as text...' from the file menu).

Update

You can use the pdffonts command line utility to get a quick-shot analysis of the fonts used by a PDF.

Here is an example output, which demonstrates where a problem for text extraction will very likely occur. It uses one of these hand-coded PDF files from a GitHub-Repository which was created to provide PDF sample files which are well commented and may easily be opened in a text editor:

$ pdffonts textextract-bad2.pdf
name                   type      encoding  emb sub uni object ID
BAAAAA+Helvetica       TrueType  WinAnsi   yes yes yes     12  0
CAAAAA+Helvetica-Bold  TrueType  WinAnsi   yes yes no      13  0

How to interpret this table?

The above PDF file uses two subsetted fonts (as indicated by the BAAAAA+ and CAAAAA+ prefixes to their names, as well as by the yes entries in the sub column), Helvetica and Helvetica-Bold. Both fonts are of type TrueType. Both fonts use a WinAnsi encoding (a font encoding maps char identifiers used in the PDF source code to glyphs that should be drawn). However, only for font /Helvetica is there a /ToUnicode table available inside the PDF (for /Helvetica-Bold there is none), as indicated by the yes/no in the uni column.

The /ToUnicode table is required to provide a reverse mapping from character identifiers/codes to characters.

A missing /ToUnicode table for a specific font is almost always a sure indicator that text strings using this font cannot be extracted or copied'n'pasted from the PDF. (Even if a /ToUnicode table is there, text extraction may still pose a problem, because this table may be damaged, incorrect or incomplete -- as seen in many real-world PDF files, and as also demonstrated by a few companion files in the above linked GitHub repository.)
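The check described above can be automated. Here is a minimal sketch that parses pdffonts output, passed in as text, and lists the fonts whose uni column is "no". It assumes the column layout shown in the example (name, type, encoding, emb, sub, uni, object ID) and that font names contain no spaces; the function names are hypothetical, and the sample text is copied from the example rather than produced by running the tool.

```python
def parse_pdffonts(output):
    """Parse the text printed by pdffonts into a list of dicts."""
    fonts = []
    for line in output.splitlines():
        tokens = line.split()
        # Skip blank lines, the separator row, and the header row.
        if len(tokens) < 8 or tokens[0] == "name":
            continue
        fonts.append({
            "name": tokens[0],
            # The type may span several words (e.g. "Type 1"), so the
            # remaining columns are read from the right-hand side.
            "type": " ".join(tokens[1:-6]),
            "encoding": tokens[-6],
            "emb": tokens[-5] == "yes",
            "sub": tokens[-4] == "yes",
            "uni": tokens[-3] == "yes",
        })
    return fonts


def fonts_without_tounicode(output):
    """Names of fonts that have no /ToUnicode table (uni column is "no")."""
    return [f["name"] for f in parse_pdffonts(output) if not f["uni"]]


SAMPLE = """\
name                   type      encoding  emb sub uni object ID
BAAAAA+Helvetica       TrueType  WinAnsi   yes yes yes     12  0
CAAAAA+Helvetica-Bold  TrueType  WinAnsi   yes yes no      13  0
"""

print(fonts_without_tounicode(SAMPLE))
```

An empty result does not guarantee clean extraction, since a /ToUnicode table that is present can still be damaged or incomplete, as noted above.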

wanghaisheng commented 6 years ago

http://www.thebigdata.cn/JieJueFangAn/30066.html https://www.import.io/ http://www.cogniview.com/ http://www.datawatch.com/our-platform/monarch/

wanghaisheng commented 6 years ago

http://tm.durusau.net/?cat=1480
A Comparison of Two Unsupervised Table Recognition Methods from Digital Scientific Articles (2014)
Configurable Table Structure Recognition in Untagged PDF documents (2016)
Extracting hierarchical data points and tables from scanned contracts (2013)
Towards domain-independent information extraction from web tables (2007)
A methodology for evaluating algorithms for table understanding in PDF documents (2012)
Layout-aware text extraction from full-text PDF of scientific articles (2012)
Extraction of References Using Layout and Formatting Information from Scientific Articles (2013)
Information Extraction and Annotation Systems and Methods for Documents (2013)
Ground-Truth and Performance Evaluation for Page Layout Analysis of Born-Digital Documents (2014)
Analysis of Documents Born Digital (2014)

wanghaisheng commented 6 years ago

Herve Dejean et al. at Xerox:
A system for converting PDF documents into structured XML format (2006)
Extracting structured data from unstructured document with incomplete resources (2015)
https://www.bing.com/academic/profile?id=2164603628&mkt=zh-cn

A lab at Peking University: Xin Tao, Zhi Tang, Canhui Xu, Liangcai Gao, Ground-Truth and Performance Evaluation for Page Layout Analysis of Born-Digital Documents

wanghaisheng commented 6 years ago

With the advancements in information and communication technology, various forms of paper documents are being scanned in order to be interpreted and indexed. The bigger vision, however, is to treat paper as a legitimate form of media (like magnetic tapes and optical discs) which can be both machine and human readable. One challenge is that the variety of paper documents being scanned today is much more diverse than it was several years ago. Many new scripts, more complex non-Manhattan page layouts, and various font styles are making this vision challenging. Furthermore, a much larger percentage of handwritten material is being acquired which does not adhere to traditional layout constraints. Character recognition as well as various established pre-processing modules such as noise removal, layout analysis and zone classification are affected by this increased complexity.

The process of identifying structures of a document image can be based on the physical layout (dividing the document into physically homogeneous zones) or the logical layout (assigning logical roles and relations to the detected zones). Page segmentation algorithms fall into the category of physical layout analysis. They segment a document page into homogeneous zones, each consisting of only one physical layout structure such as text, graphics, equations, logos, or stamps. Physical layout analysis can be pixel-based or texture-based segmentation, but here the goal is that the final result is a region segmentation. In texture-based segmentation, isolated points or small areas could be classified as zonal objects, disregarding the connectivity aspect of an object. In contrast, the work is concerned with non-overlapping geometric zones where document components are separated by white space. Such connected-component-based approaches use macro-level content information, and can be further classified into Manhattan and non-Manhattan layouts. https://lampsrv02.umiacs.umd.edu/projdb/project.php?id=57
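To make the idea of zones separated by white space concrete, here is a deliberately simple sketch that splits a binarized page into horizontal bands wherever enough consecutive blank rows occur. It is a toy projection-profile split, not the connected-component method of the project described above. The function name, the `min_gap` parameter, and the list-of-lists image format (1 for ink, 0 for background) are all assumptions made for illustration.

```python
def split_on_blank_rows(image, min_gap=1):
    """Return (start, end) row ranges (end exclusive) that contain ink.

    A new zone starts after at least `min_gap` consecutive blank rows.
    `image` is a list of rows, each a list of 0/1 pixel values.
    """
    zones = []
    start = None       # first row of the zone currently being built
    blank_run = 0      # blank rows seen since the last inked row
    for i, row in enumerate(image):
        if any(row):
            if start is None:
                start = i
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= min_gap:
                # Close the zone at the last inked row.
                zones.append((start, i - blank_run + 1))
                start = None
                blank_run = 0
    if start is not None:
        # Close a zone that runs to the bottom, dropping trailing blanks.
        zones.append((start, len(image) - blank_run))
    return zones


page = [
    [0, 0],
    [1, 0],
    [1, 1],
    [0, 0],
    [0, 0],
    [0, 1],
    [0, 0],
]

print(split_on_blank_rows(page, min_gap=1))
print(split_on_blank_rows(page, min_gap=3))
```

Applying the same split to columns inside each band, and repeating, gives a recursive cut that works for Manhattan layouts; non-Manhattan layouts need the more general approaches the paragraph above refers to.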