Closed hjing100 closed 2 years ago
Wrap it in a try.
------------------ Original ------------------ From: hjing100 @.> Date: Wed, Jan 26, 2022 15:30 To: ownthink/KnowledgeGraphData @.> Cc: Subscribed @.***> Subject: Re: [ownthink/KnowledgeGraphData] _csv.Error: line contains NULL byte (Issue #28)
with open('ownthink_v2.csv', 'r', encoding='utf8') as fin:
    reader = csv.reader(fin)
    for index, read in enumerate(reader):
Hello, when I run the reading code above, it fails partway through the file with _csv.Error.
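The error is raised on the line `for index, read in enumerate(reader)` itself. Wrapping it in a try doesn't seem to let the rest of the for loop carry on. What I want is the effect of continue.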
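As I recall, it was written as a while True loop that reads one line at a time and wraps each read in a try, so the line that fails is skipped. You can search for it; it has been a while and I don't remember where I saw it.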
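The while True approach described above could look roughly like the sketch below. It is not the code the commenter had in mind (that was never posted); the helper name `iter_rows_skipping_errors` is made up for illustration. The key point is that the try wraps the single `next(reader)` call rather than the whole for loop, so one bad row does not end the iteration. Whether a NUL byte raises `csv.Error` at all depends on the Python version, so treat this as a general pattern for skipping rows the csv module rejects.

```python
import csv


def iter_rows_skipping_errors(reader):
    """Yield (line_num, row) from a csv reader, skipping rows that raise csv.Error.

    Hypothetical helper for illustration; `reader` is any csv.reader object.
    """
    while True:
        try:
            row = next(reader)
        except StopIteration:
            break  # end of input
        except csv.Error as exc:
            # The offending line has already been consumed, so the next
            # call to next(reader) starts on the following line.
            print(f"skipped line {reader.line_num}: {exc}")
            continue
        yield reader.line_num, row
```

With the file from this issue it would be used as `for line_num, row in iter_rows_skipping_errors(csv.reader(fin)):` inside the existing `with open(...)` block.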
OK, thanks. I'll look into it.
There appears to be a \x00 null byte at around line 9929226:
>>> with open('D:/temp/ownthink_v2/ownthink_v2.csv', 'r') as f:
...     i = 1
...     while i < 9929228:
...         l = f.readline()
...         if i > 9929225:
...             l
...         i += 1
...
'杂剧石棺,"出土\x00时间",1978年\n'
'杂剧石棺,现存地,河南博物院\n'
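Since the problem is a stray \x00 inside an otherwise normal row, another option is to strip NUL characters from each line before the csv module sees it, so that the row is kept instead of skipped. The sketch below assumes the NUL is junk and can be dropped, which turns "出土\x00时间" into "出土时间". The names `strip_nul` and `read_rows_without_nul` are hypothetical.

```python
import csv


def strip_nul(lines):
    """Yield each text line with NUL characters removed."""
    for line in lines:
        yield line.replace("\x00", "")


def read_rows_without_nul(path, encoding="utf-8"):
    """Yield parsed CSV rows from `path`, dropping NUL characters first."""
    with open(path, "r", encoding=encoding, newline="") as fin:
        # csv.reader accepts any iterable of strings, not only file objects.
        yield from csv.reader(strip_nul(fin))
```

One caveat: because the filter works line by line, it suits files like this one where each record sits on a single physical line.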
Here is how I worked around it on my side; not sure whether it helps:
import csv
import zipfile
from io import TextIOWrapper

# https://stackoverflow.com/questions/26942476/reading-csv-zipped-files-in-python
# https://stackoverflow.com/questions/50259792/reading-csv-files-from-zip-archive-with-python-3-x
zippedFileName = 'C:/Users/Henry/Downloads/ownthink_v2.zip'
pwd = 'https://www.ownthink.com/'

with zipfile.ZipFile(zippedFileName) as archive:
    # Read the CSV straight out of the password-protected archive.
    with archive.open('ownthink_v2.csv', pwd=bytes(pwd, encoding='utf-8')) as f:
        with TextIOWrapper(f, 'utf-8') as wrappedF:
            reader = csv.reader(wrappedF)
            linesToRead = 10  # only attempt the first 10 rows
            while linesToRead > 0:
                try:
                    row = next(reader)
                    print(row)
                except StopIteration:
                    break  # end of file
                except csv.Error:
                    # Bad row (e.g. one containing a NUL byte): report it and move on.
                    print(f'Error at line: {reader.line_num}')
                finally:
                    linesToRead -= 1