tfuji opened this issue 2 months ago (status: Open)
@mitsuhashi @skwsm You can check all of the entries at the link below. Please make the corrections. https://github.com/dbcls/humandbs/tree/dev?tab=readme-ov-file#users-controlled-access-data
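@tfuji @skwsm Thank you for your work.
Looking at the JSON produced by the scraper, it appears that the script does not expect a Country/Region column. For the columns to the right of it, the column names and values appear to be misaligned.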
dbcls3284:json_from_joomla mitsuhashi$ git branch
import_json
* import_json_skwsm
main
dbcls3284:json_from_joomla mitsuhashi$ grep 'Country' humandb_20231223_both.json | head -10
I checked the Joomla! HTML, and I don't think there is any problem with how the column names and values correspond there.
https://humandbs.dbcls.jp/en/hum0355-v1
<p> </p>
<h1><span style="text-decoration: underline; font-family: helvetica; font-size: 15pt;"><strong>USRES (Controlled-access Data)</strong></span></h1>
<table class="table-style style-greystripes" style="width: 922px; height: 70px;">
<thead>
<tr><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Principal Investigator</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Affiliation</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Country/Region</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Research Title</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Data in Use (Dataset ID)</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Period of Data Use</span></th></tr>
</thead>
<tbody>
<tr>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Maher Eamonn</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">University of Cambridge</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">United Kingdom of Great Britain and Northern Ireland</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Molecular Pathology of Human Genetic Disease</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">JGAD000663</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">2023/03/19-2024/07/20</span></td>
</tr>
</tbody>
</table>
https://humandbs.dbcls.jp/en/hum0327-v1
<p> </p>
<h1><span style="text-decoration: underline; font-family: helvetica; font-size: 15pt;"><strong>USRES (Controlled-access Data)</strong></span></h1>
<table class="table-style style-greystripes" style="width: 922px; height: 70px;">
<thead>
<tr><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Principal Investigator</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Affiliation</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Country/Region</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Research Title</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Data in Use (Dataset ID)</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Period of Data Use</span></th></tr>
</thead>
<tbody>
<tr>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Michiaki Hamada</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Faculty of Science and Engineering, Waseda University</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Japan</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Construction of RNA-targeted Drug Discovery Database</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">JGAD000624</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">2022/12/26-2025/03/31</span></td>
</tr>
</tbody>
</table>
https://humandbs.dbcls.jp/en/hum0320-v1
<p> </p>
<h1><span style="text-decoration: underline; font-family: helvetica; font-size: 15pt;"><strong>USRES (Controlled-access Data)</strong></span></h1>
<table class="table-style style-greystripes" style="width: 922px; height: 70px;">
<thead>
<tr><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Principal Investigator</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Affiliation</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Country/Region</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Research Title</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Data in Use (Dataset ID)</span></th><th align="center"><span style="font-family: helvetica; font-size: 11pt;">Period of Data Use</span></th></tr>
</thead>
<tbody>
<tr>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Ansuman Satpathy</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Department of Pathology, Stanford University</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">United States of America</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">Epigenetics of Inflammatory Skin Disorders</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">JGAD000597</span></td>
<td><span style="font-family: helvetica; font-size: 10pt; line-height: normal;">2022/07/04-2023/05/31</span></td>
</tr>
</tbody>
</table>
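Since all three tables above have the same six `<th>` headers in the same order as their `<td>` cells, one way to avoid a shift is to pair each cell with the header text instead of relying on a fixed column position. The following is only a minimal sketch using the Python standard library, not the actual scraper in this repository; the class `TableRows`, the function `rows_as_dicts`, and the trimmed-down `SAMPLE_HTML` are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser

# Trimmed-down copy of the hum0355-v1 table (styles removed for brevity).
SAMPLE_HTML = """
<table>
<thead>
<tr><th><span>Principal Investigator</span></th><th><span>Affiliation</span></th>
<th><span>Country/Region</span></th><th><span>Research Title</span></th>
<th><span>Data in Use (Dataset ID)</span></th><th><span>Period of Data Use</span></th></tr>
</thead>
<tbody>
<tr>
<td><span>Maher Eamonn</span></td>
<td><span>University of Cambridge</span></td>
<td><span>United Kingdom of Great Britain and Northern Ireland</span></td>
<td><span>Molecular Pathology of Human Genetic Disease</span></td>
<td><span>JGAD000663</span></td>
<td><span>2023/03/19-2024/07/20</span></td>
</tr>
</tbody>
</table>
"""


class TableRows(HTMLParser):
    """Collect <th> texts as headers and <td> texts as rows (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self.rows = []
        self._row = None   # cells of the <tr> being read
        self._cell = None  # text fragments of the <th>/<td> being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            text = " ".join("".join(self._cell).split())  # normalize whitespace
            if tag == "th":
                self.headers.append(text)
            elif self._row is not None:
                self._row.append(text)
            self._cell = None
        elif tag == "tr":
            if self._row:  # header rows contain no <td>, so they are skipped
                self.rows.append(self._row)
            self._row = None


def rows_as_dicts(html):
    """Map every body row to {header: value}; fail loudly on a length mismatch."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    records = []
    for row in parser.rows:
        if len(row) != len(parser.headers):
            raise ValueError(f"{len(row)} cells for {len(parser.headers)} headers: {row}")
        records.append(dict(zip(parser.headers, row)))
    return records


records = rows_as_dicts(SAMPLE_HTML)
print(records[0]["Country/Region"])
print(records[0]["Period of Data Use"])
```

Raising on a length mismatch means that a page with an unexpected extra column (such as Country/Region) stops the run instead of silently shifting values into the wrong keys.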
In json_from_joomla/humandb_20231223_both.json, "Period of Data Use" contains the value of "Data in Use (Dataset ID)".
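To find every affected entry rather than spot-checking, a quick check could flag records whose "Period of Data Use" does not look like a date range or contains a dataset ID. This is a rough sketch under assumptions: the real layout of humandb_20231223_both.json is not shown in this thread, so the records here are a hypothetical flat list of dicts, and the patterns are inferred only from the three examples above (`YYYY/MM/DD-YYYY/MM/DD` and `JGAD` followed by six digits). The name `suspicious_periods` is made up for illustration.

```python
import re

# Patterns inferred from the examples in this thread (assumptions, not a spec).
PERIOD_RE = re.compile(r"\d{4}/\d{2}/\d{2}-\d{4}/\d{2}/\d{2}")
DATASET_ID_RE = re.compile(r"JGAD\d{6}")


def suspicious_periods(records):
    """Return (index, value) for records whose period looks shifted or malformed."""
    flagged = []
    for i, rec in enumerate(records):
        period = rec.get("Period of Data Use", "")
        looks_like_id = DATASET_ID_RE.search(period) is not None
        looks_like_period = PERIOD_RE.fullmatch(period.strip()) is not None
        if looks_like_id or not looks_like_period:
            flagged.append((i, period))
    return flagged


# Hypothetical records: the first is correct, the second is shifted by one column.
sample = [
    {"Data in Use (Dataset ID)": "JGAD000624",
     "Period of Data Use": "2022/12/26-2025/03/31"},
    {"Data in Use (Dataset ID)": "Molecular Pathology of Human Genetic Disease",
     "Period of Data Use": "JGAD000663"},
]

flagged = suspicious_periods(sample)
print(flagged)
```

If the actual JSON nests these fields differently, or if some periods use another date format, the key lookup and `PERIOD_RE` would need to be adjusted accordingly.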