learningequality / kolibri-library

A repository for tracking issues related to the Kolibri Library
MIT License

Build Channel for Funda Wande #13

Closed jredrejo closed 4 months ago

jredrejo commented 6 months ago

Overview

Build public channel for https://fundawande.org/learning-resources

Description and outcomes

Content is in PDF format.

License: CC-BY (page 3 of the workbooks)
Copyright owner: Funda Wande

jredrejo commented 6 months ago

Done, in review: https://studio.learningequality.org/en/channels/77da46f34a335d0dbdbd6789156f38bd

@rtibbles https://github.com/learningequality/sushi-chef-funda-wande-english contains the cheffing code.

radinamatic commented 6 months ago

I updated the channel to the most recent version, whose note stated that 2 corrupted PDF files had been replaced. It did fix one that I noted in my first test, but I can still see at least 2 others:

[Screenshots: fundawande, fundawande2]

jredrejo commented 6 months ago

@radinamatic thank you. The Funda Wande site was really slow and some files seem to have been downloaded with problems. I detected some of them, but I haven't opened all 118 PDF files to check every one. I've re-downloaded the two you found and created a new version of the channel. Please let me know if you find any others.

radinamatic commented 6 months ago

So I downloaded the full channel, but to expedite checking whether any of the PDF files are corrupted (I also did not have the patience to go through 100+), I used another application that found 6 more (they are marked corrupted and maybe corrupted in the attached report).

CorruptedPDFinder_results.txt

corrupted-PDFs

I confirmed that they cannot be opened in Firefox, so the report is probably accurate. Do you have the means of figuring out which files are those from the Studio resource file name? I can't think of the way to do it myself in Kolibri... 🤔
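One possible way to map the reported names back to source files, sketched under an assumption: the file names in Kolibri's content storage are the MD5 checksum of the file contents plus the extension. If that holds, hashing the chef's local downloads and comparing against the names in the report should identify them. The function names and the `source_dir` folder are hypothetical, not part of Kolibri or ricecooker.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_storage_names(source_dir, storage_names):
    """Map each storage name (e.g. '<md5>.pdf') to the local PDF with that MD5, or None.

    Assumption: storage names are '<md5 of contents>.<extension>'.
    """
    by_hash = {md5_of_file(p): p for p in Path(source_dir).rglob("*.pdf")}
    return {name: by_hash.get(Path(name).stem) for name in storage_names}
```

The names passed as `storage_names` would be the ones listed in the report. Any name that maps to `None` has no byte-identical file among the local downloads.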

There are also other commands you could use that I would have tried had I not decided to test this channel in Windows... 🤦🏽‍♀️ Could these be run after you download the files from the source site, and prior to uploading them to Studio?

It comes to mind that checking that downloaded PDF files are not corrupted (and maybe retrying the download) would be a useful addition to the chef workflow in any case. Does ricecooker have something to that effect, @rtibbles?
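To illustrate the idea, here is a minimal sketch of such a check using only the Python standard library. It is not ricecooker's API, and the function names are hypothetical. The check is a heuristic: it only looks for the `%PDF-` header near the start and the `%%EOF` marker near the end, which catches truncated downloads and HTML error pages but not every kind of corruption.

```python
import http.client
import time
import urllib.request


def looks_like_complete_pdf(data):
    """Heuristic: PDF header near the start and %%EOF marker near the end."""
    return b"%PDF-" in data[:1024] and b"%%EOF" in data[-1024:]


def download_pdf(url, retries=3, delay=2.0, fetch=None):
    """Download a PDF, retrying while the payload looks truncated or corrupted.

    `fetch` is an optional callable (url -> bytes), useful for testing.
    """
    if fetch is None:
        def fetch(target):
            with urllib.request.urlopen(target, timeout=60) as response:
                return response.read()

    for attempt in range(1, retries + 1):
        try:
            data = fetch(url)
        except (OSError, http.client.HTTPException):
            data = b""  # treat network errors like a bad download
        if looks_like_complete_pdf(data):
            return data
        if attempt < retries:
            time.sleep(delay)
    raise ValueError(f"{url} still looks corrupted after {retries} attempts")
```

A stricter variant could open each file with a PDF parsing library before upload. The header and trailer check is cheap enough to run on every download.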

radinamatic commented 6 months ago

Some other points for improvements:

  1. There are 2 folders named Reading Academy, which is confusing. One contains videos, and the other PDF files. Given that those two types of resources are presented together in the rest of the folders, can we be fully consistent here and have one Reading Academy folder with both?

    [Screenshot: fundawande3]

  2. I can understand why we may not want to create deep nested subfolders for Term 1, Term 2 etc. inside Literacy Workbooks and Teaching Guides, but instead prepend the respective term to the resource name. Could we use the dash - instead of the slash / to do that? I believe it would improve the readability.

  3. The order of the resources inside the Reading for Meaning Course folder is a bit scattered. After the modules the resources are numbered, but start with 55, then 2, 14, 23, 139, 122... And it ends with 234, 1, 7, 142.

    [Screenshot: fundawande4]

    I understand not all may be available in English, but could we at least have them in the proper ascending order, even if not fully sequential?
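As a rough illustration of the ordering requested in point 3, numbered titles can be sorted by their leading number instead of as plain text. This is a sketch with made-up titles, not the chef's actual code. It assumes the number is at the start of the title and that unnumbered entries (such as the modules) should stay first.

```python
import re


def leading_number_key(title):
    """Sort key: unnumbered titles first (alphabetically), then by leading number."""
    match = re.match(r"\s*(\d+)", title)
    if match:
        return (1, int(match.group(1)), title)
    return (0, 0, title)


# Hypothetical titles, for illustration only.
titles = ["55 Story", "2 Story", "Module 1", "14 Story", "139 Story"]
ordered = sorted(titles, key=leading_number_key)
# ordered == ["Module 1", "2 Story", "14 Story", "55 Story", "139 Story"]
```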

jredrejo commented 6 months ago

Thank you. After checking the reported files, only 3 of them were actually corrupted. Also, the ricecooker check sounds like a good idea. Could you open an issue in ricecooker for it? If not, I'll file it.

jredrejo commented 6 months ago

> Some other points for improvements:
>
> 1. There are 2 folders named Reading Academy, which is confusing. One contains videos, and the other PDF files. Given that those two types of resources are presented together in the rest of the folders, can we be fully consistent here and have one Reading Academy folder with both?

There were some trailing spaces in the topic name, caused by errors in the original page. Fixed.

> 2. I can understand why we may not want to create deep nested subfolders for Term 1, Term 2 etc. inside Literacy Workbooks and Teaching Guides, but instead prepend the respective term to the resource name. Could we use the dash - instead of the slash / to do that? I believe it would improve the readability.

Actually, I don't know why a dash is better than a slash, but as I don't have an opinion on it, I've changed it.

> 3. The order of the resources inside the Reading for Meaning Course folder is a bit scattered. After the modules the resources are numbered, but start with 55, then 2, 14, 23, 139, 122... And it ends with 234, 1, 7, 142.
>
> I understand not all may be available in English, but could we at least have them in the proper ascending order, even if not fully sequential?

In fact, they were sorted by module, but the module was not visible in the names, so I've added it.
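For reference, the first two fixes amount to something like the sketch below. It is a simplified illustration, not the code in the chef repository, and the function names are hypothetical. Normalizing whitespace before grouping is what keeps `"Reading Academy"` and `"Reading Academy  "` from becoming two separate folders.

```python
def normalize_title(title):
    """Collapse runs of whitespace and strip leading/trailing spaces."""
    return " ".join(title.split())


def prefixed_title(term, title):
    """Prepend the term to a resource title, separated by a dash."""
    return f"{normalize_title(term)} - {normalize_title(title)}"


def group_by_topic(entries):
    """Group (topic_title, resource) pairs under normalized topic titles."""
    topics = {}
    for topic_title, resource in entries:
        topics.setdefault(normalize_title(topic_title), []).append(resource)
    return topics
```

With this, `group_by_topic([("Reading Academy", "video"), ("Reading Academy  ", "pdf")])` yields a single `"Reading Academy"` topic holding both resources.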

radinamatic commented 6 months ago

> Thank you. After checking the reported files, only 3 of them were actually corrupted.

I updated the channel locally, but there are still 6 PDF files that show up as corrupted in all browsers (Firefox, Chrome and Edge).

CorruptedPDFinder_results.txt

One weird thing happened during the update: it looked like it was performed in 2 separate tasks, one of which failed, but when I checked again to import more, the channel was apparently fully downloaded, with all resources on the device 🤷🏽‍♀️

[Screenshots: task check, 2024-03-04_20-03-11, 2024-03-04_22-30-22]

> Also, the ricecooker check sounds like a good idea. Could you open an issue in ricecooker for it? If not, I'll file it.

Done! 🙂

radinamatic commented 6 months ago

> > Some other points for improvements:
> >
> > 1. There are 2 folders named Reading Academy, which is confusing. One contains videos, and the other PDF files. Given that those two types of resources are presented together in the rest of the folders, can we be fully consistent here and have one Reading Academy folder with both?
>
> There were some trailing spaces in the topic name, caused by errors in the original page. Fixed.

Excellent! ✔️

> > 2. I can understand why we may not want to create deep nested subfolders for Term 1, Term 2 etc. inside Literacy Workbooks and Teaching Guides, but instead prepend the respective term to the resource name. Could we use the dash - instead of the slash / to do that? I believe it would improve the readability.
>
> Actually, I don't know why a dash is better than a slash, but as I don't have an opinion on it, I've changed it.

I don't have hard empirical data on this, but suspect that all but nerdy computer people would find separating with dashes easier to read 😛 Thank you for changing that! 🙏🏽

> > 3. The order of the resources inside the Reading for Meaning Course folder is a bit scattered. After the modules the resources are numbered, but start with 55, then 2, 14, 23, 139, 122... And it ends with 234, 1, 7, 142.
>
> In fact, they were sorted by module, but the module was not visible in the names, so I've added it.

Ok, thank you, looks less unordered now! 👍🏽

radinamatic commented 5 months ago

I re-checked all the files that kept being reported as corrupted or maybe corrupted, and was able to open them through Kolibri, so the content checks out, good work! 👏🏽 💯 :shipit:

jredrejo commented 5 months ago

@rtibbles waiting for your technical review after the good to go passes from @radinamatic & @revanthvle

rtibbles commented 4 months ago

This is complete - technical pieces are fine.