Closed wohenbushuang closed 1 year ago
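Could you provide a test case? It would make the need for this change much clearer.
@zepinglee Edited.
What I meant is adding a test case here. It is easy to add one in Zotero's translator editor, and it also makes it easier to re-test when the translator is modified later.
Also, could one person have multiple affiliations, for example “张三1,2” (an author name followed by two affiliation indices)?
On my side, the test run in the editor always produces blank output, and after "Run and Update" everything is cleared: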
{
"type": "web",
"url": "https://d.wanfangdata.com.cn/periodical/ysxb98202209004",
"items": [
{
"itemType": "journalArticle",
"title": "万方数据知识服务平台",
"creators": [],
"language": "zh-CN",
"libraryCatalog": "Wanfang Data",
"url": "https://d.wanfangdata.com.cn/periodical/ysxb98202209004",
"attachments": [],
"tags": [],
"notes": [],
"seeAlso": []
}
]
}
For example, when I print the tags after scraping, I get
20:22:03 Returned item:
...
"tags": [
{
"tag": "关键词:"
}
{
"tag": "古近系"
}
{
"tag": "Sokor1组"
}
{
"tag": "储层物性"
}
{
"tag": "影响因素"
}
{
"tag": "Termit盆地"
}
]
(This "关键词:" tag, i.e. the literal "Keywords:" label, looks like a new bug...)
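The stray label could be filtered out before the tags are saved. The following is a minimal sketch, not the translator's actual code: `cleanTags` and `LABEL_PATTERN` are hypothetical names, and the set of labels to drop is an assumption based only on the output above.

```javascript
// Hypothetical clean-up step, not taken from the Wanfang Data translator.
// Matches a tag that is only a section label such as "关键词:" or "Keywords:".
// Both the full-width colon (:) and the ASCII colon (:) are accepted.
const LABEL_PATTERN = /^\s*(关键词|keywords?)\s*[::]\s*$/i;

function cleanTags(tags) {
  // Keep every tag whose text is non-empty and is not a bare label.
  return tags.filter(({ tag }) => tag.trim() !== "" && !LABEL_PATTERN.test(tag));
}

const scraped = [
  { tag: "关键词:" },
  { tag: "古近系" },
  { tag: "Sokor1组" },
  { tag: "储层物性" },
  { tag: "影响因素" },
  { tag: "Termit盆地" },
];

console.log(cleanTags(scraped).map(({ tag }) => tag));
```

Matching the whole tag, rather than stripping a prefix, leaves real keywords that merely contain a colon untouched.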
The result of the test run is
20:26:24 Translation successful
20:26:24 TranslatorTester: Data mismatch detected:
20:26:24 {
"itemType": "journalArticle"
"creators": []
"attachments": []
"tags": [
- {
- "tag": "Sokor1组"
- }
- {
- "tag": "Termit盆地"
- }
- {
- "tag": "储层物性"
- }
- {
- "tag": "关键词:"
- }
- {
- "tag": "古近系"
- }
- {
- "tag": "影响因素"
- }
]
"notes": []
"seeAlso": []
"title": "万方数据知识服务平台"
"language": "zh-CN"
"libraryCatalog": "Wanfang Data"
"url": "https://d.wanfangdata.com.cn/periodical/ysxb98202209004"
}
20:26:24 TranslatorTester: Wanfang Data Test 1: unknown (Item 0 does not match)
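Note that the diff lists the tags in a different order from the scrape output, which suggests the comparison is done on sorted tags. A minimal sketch of such an order-insensitive comparison follows; `sortedTagNames` and `sameTags` are hypothetical helpers written for illustration, not the TranslatorTester implementation.

```javascript
// Hypothetical helpers, not taken from Zotero's TranslatorTester.
// Returns the tag strings sorted with the default (UTF-16 code unit) order.
function sortedTagNames(tags) {
  return tags.map(({ tag }) => tag).sort();
}

// Two tag lists are considered equal if they hold the same strings,
// regardless of the order in which they were scraped.
function sameTags(expected, actual) {
  const a = sortedTagNames(expected);
  const b = sortedTagNames(actual);
  return a.length === b.length && a.every((name, i) => name === b[i]);
}

const returnedTags = [
  { tag: "关键词:" },
  { tag: "古近系" },
  { tag: "Sokor1组" },
  { tag: "储层物性" },
  { tag: "影响因素" },
  { tag: "Termit盆地" },
];

// The stored test case has an empty "tags" array, so the comparison fails.
console.log(sameTags([], returnedTags)); // false
console.log(sortedTagNames(returnedTags));
```

Under this assumption the mismatch above comes from the stored test case having no tags at all, not from the order of the scraped tags.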
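Also, could one person have multiple affiliations, for example “张三1,2”?
I have not seen such a case so far. If one turns up, whoever finds it can provide a sample URL.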
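If such a case does turn up, the indices could be stripped in one pass. This is only a sketch under the assumption that affiliation indices appear as trailing digits separated by commas; `stripAffiliationMarkers` is a hypothetical helper, not part of the merged change.

```javascript
// Hypothetical helper, not part of the Wanfang Data translator.
// Removes trailing affiliation indices such as "1" or "1,2" from an author name.
// Both the ASCII comma (,) and the full-width comma (,) are accepted.
function stripAffiliationMarkers(name) {
  return name.replace(/\s*\d+(?:\s*[,,]\s*\d+)*\s*$/, "").trim();
}

console.log(stripAffiliationMarkers("张三1"));   // 张三
console.log(stripAffiliationMarkers("张三1,2")); // 张三
console.log(stripAffiliationMarkers("张三"));    // 张三
```

A name that legitimately ends in a digit would also be truncated, so a real implementation would need to be checked against actual pages.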
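I'll merge the code first and add the test case manually afterwards.
On my side, the test run in the editor always produces blank output, and after "Run and Update" everything is cleared: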
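I tried it and got the same result. Wanfang seems to use some technique that prevents Scaffold from scraping the information directly, which makes debugging cumbersome. Scraping in the browser works normally, though.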
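One way to spot this failure mode in a saved test case is to check for the placeholder result seen above. The sketch below assumes that "万方数据知识服务平台" is the generic site title and not an article title; `looksUnrendered` and `SITE_TITLE` are hypothetical names, not part of Zotero or the translator.

```javascript
// Hypothetical check, not part of Zotero or the Wanfang Data translator.
// Assumption: a scrape that only saw the unrendered page yields the generic
// site title and no creators or tags, as in the blank test case in this thread.
const SITE_TITLE = "万方数据知识服务平台";

function looksUnrendered(item) {
  return (
    item.title === SITE_TITLE &&
    item.creators.length === 0 &&
    item.tags.length === 0
  );
}

const blankItem = {
  itemType: "journalArticle",
  title: "万方数据知识服务平台",
  creators: [],
  tags: [],
};

console.log(looksUnrendered(blankItem)); // true
```

A check like this only flags the symptom. It does not make the page scrapable inside Scaffold.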
Wanfang uses encryption here. The web page displays normally, but running the translator directly in the test produces an error.
e.g. https://d.wanfangdata.com.cn/periodical/ysxb98202209004
Before:
After: