pishoyg / coptic

This is a project that aims to make the Coptic language more learnable.
https://remnqymi.com/
GNU General Public License v3.0

[Site/Search] Reduce Technical Debt, and Parameterize, the Pipeline #230

Open pishoyg opened 2 months ago

pishoyg commented 2 months ago

TODO:

pishoyg commented 2 months ago

Status:

Done:

No longer relevant:

TODO:

pishoyg commented 2 months ago

Status:

Done:

TODO:

pishoyg commented 2 months ago

For HTML reduction: https://stackoverflow.com/questions/1765848/remove-a-tag-using-beautifulsoup-but-keep-its-contents
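
The linked question is about dropping a tag while keeping what it contains (BeautifulSoup exposes this as `Tag.unwrap()`). Below is a dependency-free sketch of the same idea using only the standard library's `html.parser`. The names `TagUnwrapper` and `unwrap_tags` are made up for illustration and are not part of this repo's pipeline.

```python
from html.parser import HTMLParser


class TagUnwrapper(HTMLParser):
    """Drop the listed tags but keep whatever they contain."""

    def __init__(self, unwrap):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.unwrap = set(unwrap)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.unwrap:
            # get_starttag_text() returns the tag exactly as written.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> have no separate end tag.
        if tag not in self.unwrap:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.unwrap:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def unwrap_tags(html, unwrap):
    """Return `html` with every tag named in `unwrap` removed, contents kept."""
    parser = TagUnwrapper(unwrap)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, `unwrap_tags("<p>Hello <b>big <i>world</i></b></p>", ["b"])` drops the `<b>` wrapper and leaves the nested `<i>` in place.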

pishoyg commented 1 month ago

NOTE: We have started to see performance degradation now that the search index has grown to 24 MB. The page takes a noticeable few moments to load. This only happens occasionally, because the browser caches the file after the first load, but we still shouldn't be wasteful: the index should be kept as small as possible.

pishoyg commented 1 month ago

Status:

Done:

Abandoned:

TODO:

Parameterization:

Optimization:

pishoyg commented 1 month ago

I have an idea for HTML reduction. Let's add a retain_classes parameter to the selector. When we extract text from a tag, elements whose class is in that list are kept as raw HTML, and all other elements are reduced to their text.
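
A rough sketch of how such a parameter could behave, written against the standard library's `html.parser` rather than the project's actual selector code. Only `retain_classes` comes from the comment above. `RetainingExtractor`, `extract`, and the `VOID_TAGS` list are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag (a partial list, for illustration only).
VOID_TAGS = {"br", "hr", "img", "input", "wbr"}


class RetainingExtractor(HTMLParser):
    """Keep elements with a retained class as raw HTML; reduce the rest to text."""

    def __init__(self, retain_classes):
        super().__init__(convert_charrefs=False)
        self.retain = set(retain_classes)
        self.out = []
        self.depth = 0  # > 0 while inside a retained element

    def _retained(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return any(c in self.retain for c in classes)

    def handle_starttag(self, tag, attrs):
        if self.depth or self._retained(attrs):
            self.out.append(self.get_starttag_text())
            if tag not in VOID_TAGS:
                self.depth += 1

    def handle_startendtag(self, tag, attrs):
        if self.depth or self._retained(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")
            self.depth -= 1

    def handle_data(self, data):
        self.out.append(data)  # text is always kept

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def extract(html, retain_classes):
    """Return text for `html`, keeping retained-class elements as raw HTML."""
    parser = RetainingExtractor(retain_classes)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With `retain_classes=["word"]`, the input `<div><span class="word">abba</span> <i>noun</i></div>` keeps the `span` intact and strips the `div` and `i` wrappers. Everything nested inside a retained element is kept raw as well, whatever its own class is.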

pishoyg commented 1 month ago

Let's get rid of the raw parameter?

This whole thing is complicated and needs some brainstorming!

pishoyg commented 1 month ago

Update on this optimization suggestion: initial experimentation shows that it works well, so we will adopt it.

pishoyg commented 1 month ago

Status:

Done:

TODO:

pishoyg commented 1 month ago

Xooxle has become our flagship product, so we are prioritizing these improvements.

pishoyg commented 2 weeks ago

Status:

No longer relevant:

Done:

TODO:

pishoyg commented 2 weeks ago

TODO: Document how simple the HTML structure supported by Xooxle is. Currently, we only allow the following tags:

pishoyg commented 1 week ago

Status:

No longer wanted:

Done:

TODO:

pishoyg commented 1 week ago

Status:

TODO:

pishoyg commented 1 week ago

Reopened #234.