openzim / zim-requests

Want a new ZIM file? Propose ZIM content improvements or fixes? Here you are!
https://farm.openzim.org

Medline plus in English #991

Closed Popolechien closed 1 month ago

Popolechien commented 6 months ago

See also #992

Popolechien commented 6 months ago

Note: only the medical encyclopedia part is being targeted. https://medlineplus.gov/ency/ redirects to https://medlineplus.gov/encyclopedia.html so I'm not sure how that will affect the scrape.
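
To make the concern concrete, here is a minimal sketch that assumes the crawl scope is a plain URL-prefix match. That is an assumption for illustration only: the crawler's actual scope rules may differ, and `in_prefix_scope` is a hypothetical helper, not part of any tool mentioned in this thread. Under that assumption, the redirect target falls outside the `/ency/` prefix while individual encyclopedia pages stay inside it.

```python
from urllib.parse import urlsplit


def in_prefix_scope(url: str, scope_prefix: str) -> bool:
    """Hypothetical helper: True if `url` is on the same host as
    `scope_prefix` and its path starts with the prefix path."""
    u, p = urlsplit(url), urlsplit(scope_prefix)
    return u.netloc == p.netloc and u.path.startswith(p.path)


# Assumed scope: the encyclopedia directory only.
SCOPE = "https://medlineplus.gov/ency/"

# A page inside the encyclopedia directory.
print(in_prefix_scope("https://medlineplus.gov/ency/anatomyvideos/000002.htm", SCOPE))

# The page that /ency/ redirects to.
print(in_prefix_scope("https://medlineplus.gov/encyclopedia.html", SCOPE))
```

If the real scope behaves like this, the encyclopedia index page would need to be added as an extra seed or the scope widened.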

Popolechien commented 6 months ago

If the scrape does not work then https://iiab.me/modules/en-medline_plus/ is a possible alternative.

MrnateGeek commented 6 months ago

I got the icon in 5 seconds https://medlineplus.gov/images/touch-icon.png

kelson42 commented 6 months ago

@Popolechien Why only the encyclopedia part? For example https://medlineplus.gov/healthtopics.html seems of interest as well?

Popolechien commented 6 months ago

@kelson42 that would be #994

benoit74 commented 1 month ago

Recipe created at https://farm.openzim.org/recipes/medlineplus.gov_en_ency with the limit set to only 100 pages for now, to check website behavior (custom CSS will be needed anyway).

benoit74 commented 1 month ago

Just discussed with Popolechien: we will ZIM the whole website, as we do not see any real argument for ZIMing only the encyclopedia.

benoit74 commented 1 month ago

Custom CSS developed and recipe reconfigured.

Launching with only 1000 pages for now.

Will probably have an issue with pages containing videos, like https://medlineplus.gov/ency/anatomyvideos/000002.htm (the videos will probably clog the ZIM but not work at all, same issue as https://github.com/openzim/zim-requests/issues/323, see https://github.com/openzim/zimit/issues/353 ).

benoit74 commented 1 month ago

Recipe moved to https://farm.openzim.org/recipes/medlineplus.gov_en_all/

benoit74 commented 1 month ago

No issue found with videos: the file is simply an MP4, and everything works well on the test page mentioned above.

Just started the full ZIM creation in dev.

benoit74 commented 1 month ago

File is ready in dev: https://dev.library.kiwix.org/#lang=eng&q=medline, looks ok to me.

Please review as well before moving to prod.

Popolechien commented 1 month ago

LGTM.

On a side note, the huge number of external links makes it all the more important to look into openzim/zimit/issues/374 and decide whether to block them or, at the very minimum, flag them.

benoit74 commented 1 month ago

Requested in production, will update when file is ready.

benoit74 commented 1 month ago

File is ready in production.

Note that on library.kiwix.org it is affected (as is Medline ES) by https://github.com/kiwix/operations/issues/280 but the ZIM itself is OK. This is only a problem with our infra / kiwix-serve instance, so I am closing the issue.

The ZIM plays fine in dev and with the Kiwix Apple reader, and would probably play fine on a hotspot.