Open t0mcat1337 opened 9 years ago
Feature, articles are sorted by date added (id) which is a requirement for fast auto paging. This happens only in the beginning, afterwards it should basically be sorted by date.
Search the bug tracker for a full explanation ;)
So I got u right that articles are not really sorted by date (like a "Select * from ...order by date DESC") when they r displayed but only when they r inserted to the DB what makes the primary key (I think that is what u mean with ID) the sorting thing? If this is right I see that sorting as I would expect in folders is excluded by this design right now. So it is absolutly impossible to make sorting more flexible? Adding and/or create an Index to the date field in the DB is not an option?
I could imagine that more people r confused about this behaviour as they see articles not sorted by date despite the app announces them to be
see https://github.com/owncloud/news/issues/115
The issue is basically that the offset is invalidated when you mark items read. The only solution I can see is to load everything which would defeat the purpose of autopaging and make the app very slow
I'm currently trying to dig into the news app code in special and Ownclouds AppFramework in common as I have common experiences in PHP/Ajax aso. development for a couple of years now. Actually I'm thinking of something like a concatenated ID Field in the "_news_items" Table.. somehow this way: New_ID = "pub_date-id" ('1439996640-237113') which could be used as the unique id field itself or just something like an "orderHelper" Could this be a valid approach?
I try to make those changes by myself actually but as I've just started right now I'm fighting a little with the AppFramework ("findEntities" ;) )
Yes, would be a valid approach :)
GZ :)
Hm, basically you could also
ORDER BY pub_date, id DESC
I've tried that once, however it didn't work out. Can't remember the catch anymore
Ah right, that issue was the offset because you can't do:
WHERE pub_date < 123 OR IF pub_date EQUAL < id
Your approach would solve that
Some thoughts: since you are combining 2 64 bit long numbers (20 digits), you'd need to pad them with zeros on both ends to avoid a string sorting like this:
Combining both padded numbers would therefore turn into a 40 characters long string. Your example (1439996640-237113) would therefore need to become 0000000000143999664000000000000000237113.
findEntities does nothing more than map the result from the sql query (assoc array) onto the data objects by using the entity's fromRow method: https://github.com/owncloud/core/blob/master/lib/public/appframework/db/mapper.php#L318
To test your stuff, add a new 40 characters wide text field much like the guid_hash https://github.com/owncloud/news/blob/master/appinfo/database.xml#L211 to the same file and increase the version number to trigger the database migration (e.g. 6.0.2): https://github.com/owncloud/news/blob/master/appinfo/info.xml#L10 :
<field>
<name>sort_order</name>
<type>text</type>
<length>40</length>
<notnull>false</notnull>
</field>
Then add the field in pascalCase to the item entitiy: https://github.com/owncloud/news/blob/master/db/item.php#L59 :
protected $sortOrder
Finally each time when a new item is inserted:
you'd also need to update the field, e.g.:
$item = $this->itemMapper->insert($item);
$item->generateSortOrder();
$item = $this->itemMapper->update($item);
the generateSortOrder method would ofc have to be created on the Item class first and would look something like this:
public function generateSortOrder() {
$this->sortOrder = str_pad($this->pubDate, 20, STR_PAD_LEFT) . str_pad($this->id, 20, STR_PAD_LEFT);
}
Once you hit the JS part, notify me
sorry for my delay but I was late at home yesterday ;) Lets see what u wrote:
Hm, basically you could also ORDER BY pub_date, id DESC I've tried that once, however it didn't work out. Can't remember the catch anymore
This was the first thing I tried, but then the articles weren't sorted by date in the GUI despite they where, when I fired the resulting SQL directly to the DB. So I thought that there must be some mechanism which sorts the results from the DB afterwards (the JSON Object - as seen in Firebug - was sorted by ID) which tooks me to the "findEntities"...
Some thoughts: since you are combining 2 64 bit long numbers (20 digits), you'd need to pad them >with zeros on both ends to avoid a string sorting like this:
1 10 2 3 30 41 5
Combining both padded numbers would therefore turn into a 40 characters long string. Your >example (1439996640-237113) would therefore need to become >0000000000143999664000000000000000237113.
If one don't need a seperator (like the "-") couldn't this field be a BIGINT? And as the first part is the unix epoch which will always be the size of 10 digits (for a very long time ;) ) will this really be necessary?
findEntities does nothing more than map the result from the sql query (assoc array) onto the data >objects by using the entity's fromRow method: https://github.com/owncloud/core/blob/master /lib/public/appframework/db/mapper.php#L318
So "findEntities" doesn't do some sort of sorting? Like "sort by the first field in the resulting dataset"? I'm still wondering why the articels in the GUI where sorted different, than when firing directly to the DB (as described above)...
Thanks a lot for the little documentation about how using this stuff. As I said, never worked with this before ;) Then lets give it a try (hoping I will manage this ;) )
BIGINT is usually a 64bit number, check mysql docs:
BIGINT[(M)] [UNSIGNED] [ZEROFILL] A large integer. The signed range is -9223372036854775808 to 9223372036854775807. The unsigned range is 0 to 18446744073709551615.
The sorting in the GUI is done using Angular: https://github.com/owncloud/news/blob/master/templates/part.content.php#L24 and https://github.com/owncloud/news/blob/master/js/controller/ContentController.js#L89
In case you order by sortOrder https://github.com/owncloud/news/blob/master/js/controller/ContentController.js#L89 would have to return -sortOrder and sortOrder respectively.
The GUI/JS stuff is a bit more complicated, you will run into autopage issues. Check these files:
sooo... what I managed so far:
1.) Inserted a new field "sort_helper" to the table oc_news_items (BIGINT, Not signed)
2.) Manually updated this field, setted it to: CONCAT(pub_date,id) for the whole table. Looks like this now f.e.: '1439969520236639'
3.) Made the XML changes u mentioned above and so forced an OC Update
4.) Changed the SQL in "makeSelectQuery" to : ....'ORDER BY items
.sort_helper
' . $ordering;
5.) Debug "findAllFolder" by inserting some fwrites and found out, that the resulting object of "findEntitiesIgnoringNegativeLimit" now is in the right chronological order, as the SQL would assume.
6.) Debugged the same way step by step till the index function of the itemcontroller. At this point the sorting is still correct as in the SQL.
BUT: the articles (items) in the GUI stay NOT sorted in chronological order though. So I assume, now something on the client side (some js) is responsible for this re-sorting, right?
I just realized, that the "itemController" correctly includes the new field "sortHelper":
.
.
[items] => Array
(
[0] => OCA\News\Db\Item Object
(
[guidHash:protected] => 4b4b121f7ac464f0a23ea0e5b56934ef
[guid:protected] => 4b4b121f7ac464f0a23ea0e5b56934ef
.
.
[pubDate:protected] => 1439969520
.
.
[lastModified:protected] => 1439973912
[sortHelper:protected] => 1439969520236639
.
.
)
BUT the JSON Object in Firefox doesn't??
In between times I simply tried to change these lines:
https://github.com/owncloud/news/blob/master/templates/part.content.php#L24 and https://github.com/owncloud/news/blob/master/js/controller/ContentController.js#L89
from "id" to "pubDate" (just to give it a try ;) ) but of course this didn't work... I got an ugly JS Error concerning something in Angular that I'm far away from understanding.
Ok.... after fighting a little with this Angular stuff without any really success I simply deleted this:
orderBy:[Content.orderBy()] track by item.id"
here https://github.com/owncloud/news/blob/master/templates/part.content.php#L24
and now - together with my formerly mentioned changes - the GUI seems to respect the chronological order from the origin SQL...
Don't really know if this is a really valid solution... but could be an approach? :S Will hold an eye on it and see how it now behaves...
Perhaps simplyfied this a little: By changing the SQL in ItemMapper.makeSelectQuery this way:
...ORDER BY CONCAT(items
.pub_date
,items
.id
)
there seems to be no need of an extra field in the DB...
Concat could work but you'd still need to pad it with zeroes and I think the sorting by varchar is probably already slow enough.
As for the GUI the orderBy is probably not needed since everything gets reset on reload and we add new stuff in order.
A right, CONCATs result is a string.. ok.. then lets cast it before sorting like this:
order by cast(concat(pub_date,id) as UNSIGNED)
Tested it in a little demo table with 2 integer fields and numbers with different length. Should work....
Well, that won't work since you're overflowing ;)
Like I said you need to use a varchar, but you need to pad it with 0 to deal with the unwanted string sorting algorithms.
Overflowing could be indeed an impact... but IMHO just theoretically because: Unsigned BIGINT in MySQL can be max
18.446.744.073.709.551.615
In my example the casted INT will be:
1.439.969.520.236.639
far away from the maximum.
When I'm thinking right, the last valid ID in the item table could then be
9.999.999.999
Reaching the ID 10.000.000.000 will result in an overflow.
Is it realistic that this will ever happen?
Verified this with my test table (one INT field as PK named 'ID', two BIGINT fields named 'inta' and 'intb') with a few rows of example data like occuring in the oc_news_items. This is my result:
SELECT cast(concat(inta,intb) as UNSIGNED) t,inta,intb
FROM test
order by t
results in :
t | inta | intb |
---|---|---|
144014847223 | 1440148472 | 23 |
1440148472238396 | 1440148472 | 238396 |
1440148472238397 | 1440148472 | 238397 |
14401484722383970 | 1440148472 | 2383970 |
14401484729999999999 | 1440148472 | 9999999999 |
18446744073709551615 | 1440148472 | 10000000000 |
As u can see, the concat overflows in the last row, giving the max possible BIGINT value as result...
Whart do u think about that?
Yes an overflow is actually possible on larger installations that run more than 5 years. Out of the roughly 20 digits 10 are occupied for Unix time. The rest is basically 10 billion possible id's. Let's say the install runs 10 years, that's 120 months, so we're left with 80 million ids per month. Assume an install with 20000 users, then you're left with 4000 articles per month and per user. Let's say an average feed has 500 articles per month then only having 8 feeds per user would be enough to make it overflow in 10 years. If there are more feeds than it even happens sooner
Another thing: I did some calculations a while ago and I realized that 32 bit integer ids were too little so I upped to 64 bit. Given that unixtime overflows a 32bit integer in 2038 (google 2038 year problem) you can see that this is a bad idea ;D
Yes an overflow is actually possible on larger installations that run more than 5 years. Out of the roughly 20 digits 10 are occupied for Unix time. The rest is basically 10 billion possible id's. Let's say the install runs 10 years, that's 120 months, so we're left with 80 million ids per month. Assume an install with 20000 users, then you're left with 4000 articles per month and per user. Let's say an average feed has 500 articles per month then only having 8 feeds per user would be enough to make it overflow in 10 years. If there are more feeds than it even happens sooner
Which leads me to my origin quistion: Is such a scenario really realistic? 10 years run time? Without any re-installation? 20000 users? each with 500 articles per month? What brings me to another question: What about houskeeping? Will the articles stay in the table for ever? When there r really billions of datasets in this table, won't there be other problems to be expected as just "a little overflow during cast"? Or are they DELETEd after beeing read an amount of time later? In this case isn't there a possibility to re-use IDs in MySQL? Can't really remember...
Of course dealing with VARCHAR is a valid way, I'm with u, but it's a little more complicated as u correctly mentioned (filling with 0). I like the KISS principle a lot (Keep it simple, stupid) ;)
As u r the boss of this (btw. fantastic) OC-App u decide which way to go... ;) Just wanted to share my thoughts
Another thing: I did some calculations a while ago and I realized that 32 bit integer ids were too little so I upped to 64 bit. Given that unixtime overflows a 32bit integer in 2038 (google 2038 year problem) you can see that this is a bad idea ;D
lol Absolutly ... I wish u, that ur app will be used much longer than 2038 and then this is something what REALLY will happen, with no discussion
Yes, scenario is realistic, I know of installs with 10000 users :)
It's better to keep this in mind than to deal with weird bug reports later on ;)
Yes, scenario is realistic, I know of installs with 10000 users :)
Ah really? Big GZ for this!
It's better to keep this in mind than to deal with weird bug reports later on ;)
Yes, absolutly correct... just wanted to get an idea of how much the probability is for hitting the overflow... if it where REALLY little it won't be really useful to make more/complicate work than necessary (which then could again include other bugs ;) )
Nevertheless...how does housekeeping work?
Hi everybody,
I have multiple feeds organized in folders. When clicking on the folder title all (unread) articles of all feeds in this folder are displayed as expected, BUT: the sort order (newest to oldest) is messed up... (see screenshot).
Bug or feature? ;)
Environment: Owncloud 8.1.1 (stable) News 5.3.9 Browser: Firefox 40 (or any other) PHP 5.5.9
Greetings T0mc@