kiwix / kiwix-js

Fully portable & lightweight ZIM reader in Javascript
https://www.kiwix.org/
GNU General Public License v3.0

Image Search #277

Open sharun-s opened 7 years ago

sharun-s commented 7 years ago

The fast image loader is almost done, but rather than fixing some of the remaining bugs this weekend I got a bit carried away :-) I realized going from image loading to image searching wasn't such a big leap. Here's the first go at it.
[image: imagesearch]

This of course uncovered more small bugs in the image load process and introduces new issues: how to lay out images with varying heights and widths, how to decide which results get returned, etc. I decided to post an issue to get feedback on what you guys would like to see out of image search. The plan right now is to work out the remaining image load issues and then maybe work on this.

sharun-s commented 7 years ago

And since it's Wimbledon time, I have to post one more pic: [image: fed]

Jaifroid commented 7 years ago

Sorry, @sharun-s, just catching up. This looks great! Will try it out.

Jaifroid commented 7 years ago

Hi, @sharun-s, have you committed this image search anywhere? As far as I can tell, the dev branch on your fork hasn't had a commit in 13 days, and I can't see a relevant branch in this repo. I'll happily test the code if you tell me where to find it, as it looks like a unique enhancement, and I think it goes beyond what is available in Kiwix desktop, at least since I last looked at it.

sharun-s commented 7 years ago

@Jaifroid just pushed my local dump. It's a big hackjob merging text search and the display-article image loader. Instead of displaying search results, it retrieves the article HTML of each result, extracts the images and injects them into an imageresults page.
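To make the pipeline described above concrete, here is a minimal sketch of the "extract the images from each result's HTML" step. It is not the code from the branch: `extractImageUrls` and `collectImages` are hypothetical helpers, the `{ title, html }` article shape is an assumption, and a regex scan stands in for whatever extraction the real code uses.

```javascript
// Sketch only: hypothetical helpers, not functions from kiwix-js.

// Collects the src (or data-src) value of every <img> tag in an HTML string.
function extractImageUrls(articleHtml) {
  const urls = [];
  const imgTag = /<img\b[^>]*>/gi;
  // Matches the first src or data-src attribute, double- or single-quoted.
  const srcAttr = /\s(?:data-src|src)\s*=\s*(?:"([^"]*)"|'([^']*)')/i;
  for (const tag of articleHtml.match(imgTag) || []) {
    const m = srcAttr.exec(tag);
    if (m) urls.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return urls;
}

// Flattens the images of several articles into one list, remembering which
// article each image came from. The { title, html } shape is an assumption.
function collectImages(articles) {
  return articles.flatMap(({ title, html }) =>
    extractImageUrls(html).map((url) => ({ url, articleTitle: title }))
  );
}
```

Keeping the source article title next to each URL is what would let a results page link an image back to the article it came from.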

Tons of issues!! And the code is a mess. Mostly because there was no real planning. I just wanted to see if it would work. The more you play with it the more you realize there are so many different ways to do it. So looking forward to hearing your thoughts...

sharun-s commented 7 years ago

@Jaifroid I should have mentioned: just type a term into the search box and click the camera button. Ignore the text results that show up. They will go away.

Jaifroid commented 7 years ago

I've just been trying this out, @sharun-s -- it's good and impressively fast image retrieval on Edge (PC)! Makes a great new feature, and puts us ahead of the main Kiwix app, at least the desktop executable.

Two thoughts to make this even more useful:

In any case, I think this is a really good enhancement, and one worth pursuing and refining! Congratulations. (I'll test on mobile next and get back to you.)

Jaifroid commented 7 years ago

I've now tested on Edge Mobile, and it all works fine running in the browser from the filesystem, and image retrieval is fast even for a large grid of images. Running in the browser from the filesystem is a relatively good proxy for encapsulating the code within a UWP container app, since it uses the same HTML engine. So I think your code would be usable, once you've tidied it, in kiwix-js-windows, pending an investigation of memory usage in an app environment.

sharun-s commented 7 years ago

A quick regex of the current data-src tag would do the job

hehe you just see regexes everywhere, don't you? Part of the master plan to rename the app regix, I am sure! Good idea though. I really haven't spent much time on the UI. I noticed many of the images in the articles have nice captions under them, so another option would be to yank that div and inject it in. Also, all Wikipedia image links have a lot of info in their title and meta tags, and maybe this is being stored somewhere in the ZIM.

btw the images are clickable. A click will take you to the article where the image was found.

On "more flexible searching": here is the matching condition in the code -> link. As you can see (and have inferred), it does an exact match of the search keyword against the beginning of the dirEntry's title, which I guess is why they named it prefix. But you can change this condition to whatever you like to get all kinds of behaviour.
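As an illustration of swapping the matching condition, the sketch below puts the prefix behaviour next to two looser alternatives. The names `matchers` and `filterTitles` are hypothetical, titles are treated as plain strings rather than dirEntry objects, and reading "exact match" as case-sensitive is an assumption.

```javascript
// Sketch only: hypothetical predicates over plain title strings.
const matchers = {
  // Prefix behaviour as described: the title must start with the keyword.
  // Treating this as case-sensitive is an assumption.
  prefix: (title, keyword) => title.startsWith(keyword),
  // Possible alternatives, not taken from the actual code:
  prefixIgnoreCase: (title, keyword) =>
    title.toLowerCase().startsWith(keyword.toLowerCase()),
  anywhereIgnoreCase: (title, keyword) =>
    title.toLowerCase().includes(keyword.toLowerCase()),
};

// Keeps only the titles accepted by the chosen matcher.
function filterTitles(titles, keyword, matcher) {
  return titles.filter((title) => matcher(title, keyword));
}
```

A looser matcher returns more articles and therefore more images, so it trades precision for recall and increases the amount of article HTML that has to be fetched.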

Also, to understand searching, play around with text search and look at the results. Try cat and then Cat. You will see a different ordering. This stuff affects what gets retrieved. Look in the devtools console for the word "added" and it will become clearer which articles are being used (based on the matching condition). Another factor is that DirEntries might be redirects (unhandled in the current code), which must be followed to retrieve the target article. "federer", for example, redirects to Roger Federer, but since I don't follow the redirect, no images are retrieved.
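Following redirects could look roughly like the sketch below. The entry shape (`{ title, redirectTo }`), the `Map` lookup and the `resolveRedirect` helper are all assumptions made for illustration; the real DirEntry objects and the way kiwix-js resolves redirect targets will differ.

```javascript
// Sketch only: assumes each entry is { title, redirectTo }, where redirectTo
// is the title of the target entry, or undefined for a real article.
function resolveRedirect(title, entriesByTitle, maxHops = 5) {
  let entry = entriesByTitle.get(title);
  for (let hops = 0; entry && entry.redirectTo !== undefined; hops++) {
    // Give up on redirect loops or unreasonably long chains.
    if (hops >= maxHops) return undefined;
    entry = entriesByTitle.get(entry.redirectTo);
  }
  return entry; // the target article entry, or undefined if not found
}

// Usage with the example from the comment above:
const entries = new Map([
  ["federer", { title: "federer", redirectTo: "Roger Federer" }],
  ["Roger Federer", { title: "Roger Federer" }],
]);
const target = resolveRedirect("federer", entries);
```

With the redirect resolved first, the image extraction would run on the target article's HTML instead of finding nothing for the redirect entry.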

kelson42 commented 5 years ago

@sharun-s Would you still be willing to work on this and bring it to a conclusion? We have created a tiled homepage for selections in MWoffliner. You can see a demo here: http://library.kiwix.org/wikipedia_es_football_nopic_2019-05/A/index. AFAIK this is pure JavaScript, done using the library https://masonry.desandro.com/. I think this could be reproduced here quite easily.
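For readers unfamiliar with tiled layouts, the underlying idea can be sketched as "put each image into the currently shortest column". This is not Masonry's API or the MWoffliner implementation; `layoutColumns` and the `{ width, height }` image shape are hypothetical.

```javascript
// Sketch only: shortest-column placement for images of varying sizes.
// Every image is scaled to columnWidth, keeping its aspect ratio.
function layoutColumns(images, columnCount, columnWidth) {
  const heights = new Array(columnCount).fill(0); // running height per column
  return images.map((img) => {
    const col = heights.indexOf(Math.min(...heights)); // leftmost shortest column
    const scaledHeight = (img.height * columnWidth) / img.width;
    const position = {
      x: col * columnWidth,
      y: heights[col],
      width: columnWidth,
      height: scaledHeight,
    };
    heights[col] += scaledHeight;
    return position;
  });
}
```

The image dimensions must be known before placement, so a results page would need to wait for each image to load (or read its size from metadata) before calling a function like this.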

sharun-s commented 5 years ago

@kelson42 That looks really good. And probably all of it is reusable. My branch is way, way behind... and it also looks like something in the latest Firefox broke it. Will take a while. I have a feeling I actually used masonry. Will check when I get some time.