New Problem - Githubissues

@jsvine I have a new situation. I solved the previous problem by adjusting y_tolerance, but I encountered the following problem with the new document. The document format is as follows:

Documents like this are laid out in two columns (left and right). When I call extract_text() with the default y_tolerance, text ends up on the wrong lines. Part of the output looks like this: ` 第一章财务管理概论　使用会计云课堂或微信扫码快速做题对答案看解析 “ ” App “ ” 、、、掌握解题思路开启轻松过关之旅 , 。有利于企业资源的合理配置一、单项选择题 C. 反映创造利润与投入资本之间的关系 D.

下列各项财务管理环节中与奖惩紧密 , 5. 下列各项措施中不能协调股东和经营联系是贯彻责任制原则的要求也是 , , , 者的利益冲突的是构建激励与约束机制的关键环节的是 (　　 )。通过市场约束经营者 A. (　　 )。通过债权人约束经营者财务决策财务控制 B. A. B. 给予经营者一定的股票期权财务分析财务评价 C. C. D. 解聘经营者 D. ` When I set y_tolerance=7, the extracted text is as follows:

In other words, when I increase y_tolerance, text that was wrongly split across two lines is joined back onto one line, but text from different lines is also merged onto the same line, which scrambles the content. I have tried several values of y_tolerance, but none of them resolves this. How can I extract the text in roughly the same layout as the document? I would really appreciate a reply, thank you!
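One possible way to look at the trade-off described above: with a two-column page, no single y_tolerance can work if lines in the left and right columns sit at slightly different heights, because grouping by vertical position alone mixes both columns. A minimal sketch, assuming the words are dicts shaped like the output of pdfplumber's `page.extract_words()` (keys `text`, `x0`, `x1`, `top`) and assuming the column boundary is at a known x position. The helper names `group_lines` and `extract_two_columns` and the `split_x` parameter are hypothetical, not part of pdfplumber:

```python
def group_lines(words, y_tolerance=3):
    """Group word dicts into text lines.

    A word joins the current line when its `top` is within
    `y_tolerance` of the first word of that line.
    """
    lines = []
    for word in sorted(words, key=lambda w: (w["top"], w["x0"])):
        if lines and abs(word["top"] - lines[-1][0]["top"]) <= y_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order words left to right before joining.
    return [
        " ".join(w["text"] for w in sorted(line, key=lambda w: w["x0"]))
        for line in lines
    ]


def extract_two_columns(words, split_x, y_tolerance=3):
    """Split words at `split_x` (hypothetical column boundary),
    then build lines for the left column followed by the right column."""
    left = [w for w in words if (w["x0"] + w["x1"]) / 2 < split_x]
    right = [w for w in words if (w["x0"] + w["x1"]) / 2 >= split_x]
    return group_lines(left, y_tolerance) + group_lines(right, y_tolerance)


# Made-up sample: two left-column lines and two right-column lines
# whose vertical positions are slightly offset from each other.
sample_words = [
    {"text": "A", "x0": 10, "x1": 20, "top": 100},
    {"text": "B", "x0": 25, "x1": 35, "top": 102},
    {"text": "C", "x0": 10, "x1": 20, "top": 120},
    {"text": "X", "x0": 310, "x1": 320, "top": 104},
    {"text": "Y", "x0": 310, "x1": 320, "top": 118},
]

mixed = group_lines(sample_words, y_tolerance=3)
separated = extract_two_columns(sample_words, split_x=300, y_tolerance=3)
print(mixed)
print(separated)
```

With a real page, the words could come from `page.extract_words()` and the boundary could be guessed as `page.width / 2`, although that midpoint is only an assumption about this layout. Another option along the same lines is to crop each half with `page.crop((x0, top, x1, bottom))` and call `extract_text()` on each crop separately.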

jsvine / pdfplumber

New Problem #745