openzim / openedx

Open edX (to zim) scraper
GNU General Public License v3.0

Invalid internal links #105

Closed by rgaudin 4 years ago

rgaudin commented 4 years ago

zimcheck reports many invalid internal links in the latest phzh ZIM:

[ERROR] Invalid internal links found:
  The following links:
- ../../I/9a122b295d484793bbf1a33ab0217a69/digitallearning@phzh.ch?Subject=Feedback%20CORE%20English%20#01,%20v1:PHZH+W-IB+2019_E_1
(I/9a122b295d484793bbf1a33ab0217a69/digitallearning@phzh.ch) were not found in article A/9a122b295d484793bbf1a33ab0217a69/index.html
  The following links:
- ../../../../../../-/instance_assets/noreferrer.aa62a3e70ffa.js
(-/instance_assets/noreferrer.aa62a3e70ffa.js) were not found in article A/course/core-english-01/topic-5-my-feelings-and-myself/intro/intro-and-objectives/index.html
  The following links:
- ../../../../../../-/instance_assets/noreferrer.aa62a3e70ffa.js
(-/instance_assets/noreferrer.aa62a3e70ffa.js) were not found in article A/course/core-english-01/topic-5-my-feelings-and-myself/unit-1-dealing-with-different-emotions/goal/index.html
  The following links:
- ../../../../../../-/instance_assets/noreferrer.aa62a3e70ffa.js
(-/instance_assets/noreferrer.aa62a3e70ffa.js) were not found in article A/course/core-english-01/topic-5-my-feelings-and-myself/unit-1-dealing-with-different-emotions/self-evaluation/index.html
  The following links:
- ../../../../../../-/instance_assets/jquery.autocomplete.3bd10d7510d2.js
(-/instance_assets/jquery.autocomplete.3bd10d7510d2.js) were not found in article A/course/core-english-01/topic-5-my-feelings-and-myself/unit-1-dealing-with-different-emotions/step-1/index.html
  The following links:
- <%= largeSRC %>
(A/course/core-english-01/topic-5-my-feelings-and-myself/unit-1-dealing-with-different-emotions/step-2/</largeSRC /) were not found in article A/course/core-english-01/topic-5-my-feelings-and-myself/unit-1-dealing-with-different-emotions/step-2/index.html
satyamtg commented 4 years ago

This seems to be caused by the download function, which currently doesn't report whether a download succeeded. As a result, the functions that call it rewrite links even when the download failed (some files can't be downloaded because they return 404s or other errors). The fix is to return the download status and use it to decide whether the link to an asset is written (on success) or left empty (on failure).
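A rough sketch of that idea, not the scraper's actual code: `fetch_bytes`, `download`, and `asset_link` are hypothetical names, and the injectable `fetch` parameter is only an assumption made here so the logic can be exercised without network access.

```python
import pathlib
import urllib.error
import urllib.request


def fetch_bytes(url: str) -> bytes:
    """Hypothetical default fetcher: return the body, raise on HTTP/network errors."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def download(url: str, dest: pathlib.Path, fetch=fetch_bytes) -> bool:
    """Save `url` to `dest` and report whether that worked."""
    try:
        content = fetch(url)
    except (urllib.error.URLError, OSError, ValueError):
        # 404s, network failures and malformed URLs all count as "not downloaded".
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(content)
    return True


def asset_link(url: str, dest: pathlib.Path, relative_path: str, fetch=fetch_bytes) -> str:
    """Return the rewritten link only if the asset was saved, else an empty link."""
    return relative_path if download(url, dest, fetch) else ""
```

With something like this, a caller would only emit a path such as `../../-/instance_assets/...` when the file really exists in the output, so zimcheck would no longer see a link pointing at a missing entry.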

rgaudin commented 4 years ago

👍