Open niryariv opened 10 years ago
Verified the bug.
This happens because the XML files are parsed incorrectly:
Some articles are divided into several sections, which means the article image is also split into several section images. For example, for the query "http://opa.org.il/api/v1/?query=%D7%9E%D7%99%D7%A9%D7%A7%D7%99%20%D7%99%D7%A8%D7%99%D7%91" the image link is: http://www.jpress.nli.org.il/Olive/APA/NLI_heb/get/GetImage.ashx?kind=block&href=DAV/1980/7/1&id=Ar03604&ext=.png but "id=Ar03604" should have been "id=Ar0360401".
_Note that there can be several frames inside each article; we should take this into account._
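
A rough sketch of how the per-section image links could be built, in case it helps whoever picks this up. This is not the project's actual code: `block_image_urls` is a hypothetical helper, and the two-digit suffix ("01", "02", ...) is only a guess extrapolated from the single "Ar0360401" example above. The real section ids should come from the XML.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def block_image_urls(article_url, section_count):
    """Return one image URL per section of an article (hypothetical helper)."""
    parts = urlsplit(article_url)
    query = parse_qs(parts.query)
    article_id = query["id"][0]  # e.g. "Ar03604"

    urls = []
    for n in range(1, section_count + 1):
        # Flatten parse_qs lists back to single values, keeping key order.
        params = {key: values[0] for key, values in query.items()}
        # ASSUMPTION: section id = article id + two-digit section index.
        params["id"] = f"{article_id}{n:02d}"
        # safe="/" keeps the slashes in href unescaped, as in the original link.
        new_query = urlencode(params, safe="/")
        urls.append(urlunsplit(parts._replace(query=new_query)))
    return urls


broken = (
    "http://www.jpress.nli.org.il/Olive/APA/NLI_heb/get/GetImage.ashx"
    "?kind=block&href=DAV/1980/7/1&id=Ar03604&ext=.png"
)
for url in block_image_urls(broken, 2):
    print(url)
```

If the section ids turn out not to follow this pattern, only the line that sets `params["id"]` would need to change.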
To reproduce:
2. Each result contains an image URL, e.g.: