AllMangasReader-dev / AMR

AllMangasReader developer branch
www.allmangasreader.com
Other
98 stars 54 forks source link

Stop using localStorage in favor of the IndexedDB API #7

Open braiam opened 11 years ago

braiam commented 11 years ago

We should stop storing data in localStorage in favor of using fastest methods like the IndexedDB api. As described here http://www.html5rocks.com/en/tutorials/indexeddb/todo/ and https://developer.mozilla.org/en-US/docs/IndexedDB/Using_IndexedDB_in_chrome

Also, take into account that indexeddb is for all browsers/devices since it is the standard method. So, if someone just wants to implement AMR for Firefox they can do it!

bessx commented 11 years ago

Any suggestions for architecture of the db?

mexicano21 commented 11 years ago

I read again and again and never understood IndexedDB. But I know that with WebSQL abandoned we don't have another option.

BTW, I still need to prepare my workspace (a canary Chrome, the latest AMR and such), I hope I can do it soon and join all the fun.

2013/3/1 Someone516 notifications@github.com

Any suggestions for architecture of the db?

— Reply to this email directly or view it on GitHubhttps://github.com/AllMangasReader-dev/AMR/issues/7#issuecomment-14319600 .

Fábio de Godoy http://www.animenewsnetwork.com/MyManga/?user=mexicano21

braiam commented 11 years ago

Created branch feature/indexedDB for this function. Start committing here and remember that it must be backwards compatible. Use git checkout feature/indexedDB or git flow feature checkout indexedDB to move your branch here :smile:

bessx commented 11 years ago

https://developer.chrome.com/trunk/apps/syncFileSystem.html

Just leaving this here... (though this is for apps, I think it will work for exts too)

bessx commented 11 years ago

https://github.com/GoogleChrome/chrome-app-samples/tree/master/syncfs-editor

This too...

KriPet commented 11 years ago

Hello everyone. I don't usually program in javascript, but I really like this project and would like to help.

This is the thing I'd like to see most in AMR (because of the syncing), but I have some questions.

In the current LocalStorage, all the mangas are stored in a single object. The way I understand the storage.sync API, one item can maximum be 4096 (including key length). My manga json string is already 24 kB.

Do you guys have any plans on how to solve this? As far as I can see, there are three solutions:

  1. Use multiple objects and split/merge them when needed (mangas1, mangas2, mangas3 etc.) We'd need to find a way to ensure that the lists aren't getting too long, and find a way to handle deleting mangas. (Or just rebuild all the objects every time we sync.
  2. Give each manga it's own item. This is limited at around 500. In addition to the extension settings and bookmarks, we'd also need to store all the key names. A good thing with this method is that when syncing, you wouldn't have to rewrite the entire list of mangas, only the ones that have changed.)
  3. Compress the data. Personally, I like having human readable data, so I'd like to avoid this.

None of these solutions are ideal, but I'd like to hear your input. Or, if you already know how to do this, please let me know.

Oh, and I have never used Git before, or GitHub, (Only a bit mercurial) sorry if this is the wrong place to talk about these things.

EDIT: Sorry, just read up some more, and I mistook IndexedDB for being the chrome.storage.sync. Still, is the current plan to store the entire indexedDB as one item in storage.sync? Will it fit?

mexicano21 commented 11 years ago

If I remember correctly, the thing that uses more space by far is the manga list of mirrors which support listing all their mangas. This is useful for quick search, as there are no need to connect to the website when the user tries to find a manga. It is important to know too that most mirrors that AMR stores all its mangas have only a few mangas, then they don't affect much, but there are one or another that are huge manga websites and all his mangas are stored by AMR, and those are the culprits for most of space that AMR takes.

(Also, sorry for my bad english)

2013/4/24 Peter K. notifications@github.com

Hello everyone. I don't usually program in javascript, but I really like this project and would like to help.

This is the thing I'd like to see most in AMR (because of the syncing), but I have some questions.

In the current LocalStorage, all the mangas are stored in a single object. The way I understand the storage.sync API, one item can maximum be 4096 (including key length). My manga json string is already 24 kB.

Do you guys have any plans on how to solve this? As far as I can see, there are three solutions:

  1. Use multiple objects and split/merge them when needed (mangas1, mangas2, mangas3 etc.) We'd need to find a way to ensure that the lists aren't getting too long, and find a way to handle deleting mangas. (Or just rebuild all the objects every time we sync.
  2. Give each manga it's own item. This is limited at around 500. In addition to the extension settings and bookmarks, we'd also need to store all the key names. A good thing with this method is that when syncing, you wouldn't have to rewrite the entire list of mangas, only the ones that have changed.)
  3. Compress the data. Personally, I like having human readable data, so I'd like to avoid this.

None of these solutions are ideal, but I'd like to hear your input. Or, if you already know how to do this, please let me know.

Oh, and I have never used Git before, or GitHub, (Only a bit mercurial) sorry if this is the wrong place to talk about these things.

— Reply to this email directly or view it on GitHubhttps://github.com/AllMangasReader-dev/AMR/issues/7#issuecomment-16953598 .

Fábio de Godoy http://www.animenewsnetwork.com/MyManga/?user=mexicano21

braiam commented 11 years ago
In the current LocalStorage, all the mangas are stored in a single object. The way I understand the storage.sync API, one item can maximum be 4096 (including key length).

No, you are mixing both api's. localStorage is totally different of storage.* api. And, storage.* api has up to 100mB for storage... just ready to fill it up. saw your edit at the end but mail didn't have it

About your solutions, my previous aclaration just make them invalid. why? Because we are planning to use the storage.sync call ''instead'' localstorage. The problem that I'm facing is that the extension use JSON for almost everything list/preference related. So, I'm stuck (maybe just lazy) looking for all the JSON and transform it into something more DB like. I've managed to read all entries and save them in the indexedDB but that's as far I have go. I would work on this on summer when I have more time. Also, this is a way down on my list, since is a feature that would fix a bug... weird way to put it, huh.

@mexicano21 Well, the mirrors mangas list is actually being saved with webDB (one of the new features of 1.3.12) so, the only limitation we have is with the chapter list more specifically. If you run localStorage.manga on the console you will notice a huge JSON object (mine is 785k character long) that is your chapter list, and is the thing I'm trying to address with the implementation I'm working on, first the preferences then the chapter list.

KriPet commented 11 years ago

Where does it say that the storage.* api has that much space. In the link you posted in the opening post, it says that each item in the storage.sync (Not storage.local, so this won't apply to the offline storage) is limited to 4 kB.

My question was how you plan to divide up the huge manga json object to fit into these storage key/value items.

Also, I understand why you would want to move away from json, as you'd need to parse and unparse the data all the time, but am I right in understanding that the value in the storage.* api can be any serializable object? Are the json object not serializable on their own? (Sorry, don't know much about this since it's not what I've been doing in js)

fuzetsu commented 11 years ago

Yeah, from what I can see on the chrome.storage page @KriPet is right. I don't see any mention of those 100MB.

The api has some constants that represent the max storage.

For sync: QUOTA_BYTES ( 102,400 ) For local: QUOTA_BYTES ( 5,242,880 )

http://developer.chrome.com/extensions/storage.html#properties

@braiam Maybe you misread the bytes as KB?

If we're staying away from storing the data on an actual server tied to user accounts then I think something like splitting mangas into multiple objects and saving each as a separate key/value pair using storage.sync (maybe devoting an object to indexing them a bit?) could be a solution :-/ (as KriPet was suggesting).

Honestly I think that @Someone516 was on to something with syncFileSystem

mexicano21 commented 11 years ago

JSON already is javascript, no parsing needed, just loading. A simple eval() is enough (even tough this is not always the most secure way to load JSON data).

2013/4/24 Peter K. notifications@github.com

Where does it say that the storage.* api has that much space. In the link you posted in the opening post, it says that each item in the storage.sync (Not storage.local, so this won't apply to the offline storage) is limited to 4 kB.

My question was how you plan to divide up the huge manga json object to fit into these storage key/value items.

Also, I understand why you would want to move away from json, as you'd need to parse and unparse the data all the time, but am I right in understanding that the value in the storage.* api can be any serializable object? Are the json object not serializable on their own? (Sorry, don't know much about this since it's not what I've been doing in js)

— Reply to this email directly or view it on GitHubhttps://github.com/AllMangasReader-dev/AMR/issues/7#issuecomment-16980294 .

Fábio de Godoy http://www.animenewsnetwork.com/MyManga/?user=mexicano21

fuzetsu commented 11 years ago

@mexicano21 I think KriPet means parsing as in JSON.parse()

Although it's true that JSON is just a javascript object because of chromes security requirements it needs to be parsed from string form instead of just eval-ed right away, which usually involves just checking if its a valid string that follows the JSON format. At least this is my understanding.

braiam commented 11 years ago

Ok, lets set some things straight:

  1. The storage API is the frontend (?) for IndexedDB. Was a blunt of my part using them both like they are the same. http://www.chromium.org/developers/design-documents/extensions/proposed-changes/apis-under-development/managed-storage-api
  2. Yeah, my bad again. I was taking the WebSQL api as reference (which has a limit of 10mB). But if you read carefully the storage api documentation "This value will be ignored if the extension has the unlimitedStorage permission." that we already use, so the only problem is within the sync property but I'm thinking in using something like the current bookmark syncing.
  3. @mexicano21 eval is evil
  4. This issue was to assert that we have what we need to start using the storage API (be it a well documented library, that makes easy to our brain to understand or whatever), but @Someone516 did some weird references to file syncing, then @KriPet took sync from there and we ended like this.
  5. The sync items for manga reading chapter list will include only a unique numerical value, the ordering name of the manga (another thing I want to implement) and the last read link. I think that should fit more or less the 4,096 bytes per item and the about 4kB of global limit. Then add the parameters and whatnot and there you have it.
mexicano21 commented 11 years ago

2013/4/25 Braiam Peguero notifications@github.com

Ok, lets set some things straight:

  1. The storage API is the frontend (?) for IndexedDB. Was a blunt of my part using them both like they are the same. http://www.chromium.org/developers/design-documents/extensions/proposed-changes/apis-under-development/managed-storage-api
  2. Yeah, my bad again. I was taking the WebSQL api as reference (which has a limit of 10mB). But if you read carefully the storage api documentation "This value will be ignored if the extension has the unlimitedStorage permission." that we already use, so the only problem is within the sync property but I'm thinking in using something like the current bookmark syncing.
  3. @mexicano21 https://github.com/mexicano21 eval is evil

If you don't control the origin of the code being evaluated and your script is not in strict mode, it certainly is. But eval() yet has its use cases. Anyway, I was only talking about how easy is to import JSON data into script, not suggesting anything. JSON.parse() is so easy as eval() and fast enough for almost any use case I can think. Faster than any database access or xml parsing, for what it's worth.

1.

  1. This issue was to assert that we have what we need to start using the storage API (be it a well documented library, that makes easy to our brain to understand or whatever), but @Someone516https://github.com/Someone516did some weird references to file syncing, then @KriPet https://github.com/KriPet took sync from there and we ended like this.
  2. The sync items for manga reading chapter list will include only a unique numerical value, the ordering name of the manga (another thing I want to implement) and the last read link. I think that should fit more or less the 4,096 bytes per item and the about 4kB of global limit. Then add the parameters and whatnot and there you have it.

— Reply to this email directly or view it on GitHubhttps://github.com/AllMangasReader-dev/AMR/issues/7#issuecomment-16987743 .

Fábio de Godoy http://www.animenewsnetwork.com/MyManga/?user=mexicano21

KriPet commented 11 years ago

Wait braiam. Are we going to use bookmark syncing (which hasn't worked for me in ages) or storage.sync, your post is unclear on this.

But back to IndexedDB. As far as I can see, IDB and the storage.* API has nothing to do with each other. storage.* is a key/value pair, IDB is a database. Am I mistaken?

Also, while I do understand why you hate eval (it is indeed evil), if what mexicano21 says is correct (That json parsing is faster than database access), then I really don't see how IDB would win out.

I'm interested in seeing how you would implement the numerical ordered id of the mangas. Where would the index be stored?

Also, we seem to be discussing stuff outside the scope of the issue. Sorry for that. The reason I wanted in on the project was to fix the sync that hasn't worked form me. This issue just seemed the most closely related to that (With all the references to sync when talking about IDB)

braiam commented 11 years ago

@KriPet I said like the actual bookmark syncing, with the reason of that the bookmark syncing method use a JSON in the chrome bookmarks that only store:

Since we have size limit, I'm trying to get only the most relevant information that will allow us to get synced. That is, ordering name (not implemented yet), link of last chapter read, and other minor metadata. Also, I would not be syncing everything each time, the sync call only will be used when some changes (addition, deletion, updates) are introduced, or at least that's how I hope it could work.

About the relationship between IndexedDB and the storage API, seems that I don't know what we are talking about anymore. Reading lots of documentation, I didn't found, even using the storage call in the extension, where or how are these values stored or if storage API use IndexedDB somehow, as how was hinted somewhere. So, you are not mistaken, storage is a key/value pairing and indexedDB is a... databases. So, changing the description of the issue again.

The only thing that I found as a feasible solution for us, was a guy in stackoverflow that will use IndexedDB for a lots of data and storage.sync to save/sync the changes. But, lets drop the syncing for now and concentrate on moving ourself out the localStorage.

BTW, the bookmark sync do not work for you? Google fixed the bug that didn't allow bookmarks starting with javascript:void to be synced. But I haven't tested myself...

@mexicano21 I only wanted to say "eval is evil". And you know the pain that was getting rid of it because we can't use it anymore ;). And I know is useful, in actual fact the JSON use it.

KriPet commented 11 years ago

Ok, I understand now. I also saw the guy on SO, but I'm still not sure if IDB can be added to a storage.* directly or if it needs to be converted into a json string. Is the IDB serializable? The SO post seems to suggest so.

And no, the BSync has not worked for me for months. I've tried a lot of things to get it to work, like the tool you had on the forums, and updating to the dev version on both my computers. The issue might be a specific manga using unicode characters or something. I'll try backing up the list and adding only a few mangas. If I find anything I'll be adding a bug report.

fuzetsu commented 11 years ago

I also haven't been able to get BSync to work for me for quite a long time (so long I almost forgot it existed! :-P).

I can see IndexedDB being useful for getting away from localstorage and speeding the extension up but as KriPet is saying the data still needs to be serialized which I think would mean reading/loading the data from the DB and syncing it as JSON. This still leaves the issue of limited storage space.

I'm curious as to why this is being brushed off. The only issue I see is that it's in the 'trunk' and that it's only supported in Chrome 27 and on. In fact it is the method mentioned on chrome's Manage Data page.

Couldn't we just store a .json file (or many) supported by each users google drive. Maybe also include some functionality for creating backups on drive too. I guess it would add a dependency (having a google account) to anyone wanting synced manga lists, but I don't think it would be a very hard one to fulfill for anyone using google chrome anyway.

It wouldn't be the most elegant solution but it wouldn't be the worst either I don't think.

(I guess we should create an issue for this whole syncing conversation...)

EDIT: Just created it

2nd EDIT: I need to learn how to read... chrome.syncFileSystem is only for packaged apps, not extensions.

braiam commented 11 years ago

Ok, I will ignore your response to the sync problem... Well, the IndexedDB can be read/write async, and is one of the mayors bottlenecks on the extension (I've had problems trying to change preferences values while the extension was updating the chapter list; also, AMR needs more time reading/writing to the localStorage than downloading the chapter lists) and having to load the whole JSON object just to add some categories don't seems too efficient for me. @plduhoux correct me if I got it wrong :)

So, I was thinking something like:

Each action needs a counterpart for the sync function. Any troughs about my conception would be useful.

UPDATE: Changed the description to more descriptive and with a cool link.

KriPet commented 11 years ago

Seems nice, but I got a question: Why compare the first entry of the chapter list? Is this because some sites purge their chapters after a while?

Also, how would you propose to make the unique numerical value for the manga+mirror?

braiam commented 11 years ago

Why compare the first entry of the chapter list? Is this because some sites purge their chapters after a while?

Yes, we wouldn't like to keep dead links on our databases.

Also, how would you propose to make the unique numerical value for the manga+mirror?

Using parseInt('string', integer). I made a jsFiddle here http://jsfiddle.net/braiam/k6Qdr/2/ . That way we are sure that mangas have same numerical value in different pc's but different from the rest.