orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0
4.73k stars 869 forks source link

Better support for time expression #7081

Closed lvca closed 3 years ago

lvca commented 7 years ago

It would be cool having something like:

select * from Log where date >= date() - 1h

In this case h stands for hour. We could have d for days. This query extract all the logs of last 24h:

select * from Log where date >= date() - 1d
Eric24 commented 7 years ago

Interesting. Maybe look at momentjs, with a syntax like: date().add(1, 'd') or date().subtract(5, 'h'), plus options like: date().startOf('Y') and date().endOf('M')

saeedtabrizi commented 7 years ago

Thanks to @lvca , @Eric24 for recommendation . let me add INTL and support Calendars (Like the gregorian,japanese,islamic , persian and etc... ) may useful in date and time computation . its a big IMPORTANT to achieve more users in global market .
i think we have following syntax exactly :

add : select from Log where date >= calendar('persian',date).add('YEAR',5).toDate() add : select from Log where date >= calendar('hijri',date).add('MONTH',8).toDate() add : select from Log where date >= calendar('hijri',date).add('DAY',8).toDate()
add : select
from Log where date >= calendar('persian',date).add('HOUR',5).toDate() add : select from Log where date >= calendar('hijri',date).add('MINUTE',8).toGMTDate() add : select from Log where date >= calendar('hijri',date).add('SECOND',8).toISODate() add : select * from Log where date >= calendar('hijri',date).add('WEEK',8).toUTCDate()

subtract : select from Log where date >= calendar('persian',date).subtract('HOUR',6).toDate() subtract : select from Log where date >= calendar('persian',date).add('DAY',5).subtract('DAY',3).toDate() // returns 1395-02-01 subtract : select from Log where date >= calendar('persian',date).subtract('DAY',3).toFormat("yyyy-mm-dd") // returns 139511-02 subtract : select from Log where fdate >= calendar('persian',date).subtract('DAY',3).toFormat("ct-yyyy-mm-dd") // returns persian-139511-02 subtract : select from Log where fdate >= calendar('persian',date).subtract('DAY',3).fromNow() ; // returns 8 days ago subtract : select from Log where date >= calendar('persian',date).subtract('DAY',3).fromDate(anotherDate) ; // returns 5 days ago

startOf : select from Log where date >= calendar('persian',date).startOf('MONTH').toDate() // returns 1395-01-01 endOf : select from Log where date >= calendar('persian',date).endOf('MONTH').toDate() // returns 1395-12-29

startOf : select from Log where date >= calendar('persian',date).startOf('Week').toDate() endOf : select from Log where date >= calendar('persian',date).endOf('MONTH').toDate()

toCalendar : select from Log where date >= calendar('persian',date).toCalendar('persian',date).toDate() ; // convert a calendar to another calendar toCalendar : select from Log where date >= date.toCalendar('persian').toDate() ; // convert a calendar to another calendar toTimeZone : select from Log where date >= date.toCalendar('persian').toTimeZone('+3:30').toDate() ; toTimeZone : select from Log where date >= date.toCalendar('persian').toTimeZone('Asia/Tehran').toDate() ;

firstWorkingDayOf : select from Log where date >= calendar('persian',date).firstWorkingDayOf('WEEK').toDate() lastWorkingDayOf : select from Log where date >= calendar('persian',date,options:{holidays:[date1,date2,date4,...], workdayofweek:['sat','sun','mon','tue','wed'],...}).lastWorkingDayOf('MONTH').toDate()

isWorkingDay : select * from Log where calendar('persian',date,options:{holidays:[date1,date2,date4,...], workdayofweek:['sat','sun','mon','tue','wed'],...}).isWorkingDay('MONTH') = false // returns boolean true or false .

isHoliday : select * from Log where calendar('persian',date,options:{holidays:[date1,date2,date4,...], workdayofweek:['sat','sun','mon','tue','wed'],...}).isHoliday('MONTH') = true // returns boolean true or false .

toTimeSpan : select * from Log where anyhour >= calendar('persian',date,options:{holidays:[date1,date2,date4,...], workdayofweek:['sat','sun','mon','tue','wed'],...}).toTimeSpan(anotherDate).TotalHours // returns time span hours like the 143 Hour .

toTimeSpan : select * from Log where anydays >= calendar('persian',date,options:{holidays:[date1,date2,date4,...], workdayofweek:['sat','sun','mon','tue','wed'],...}).toTimeSpan(anotherDate).TotalDays // returns time span hours like the 52 day .

it seems to be a new PROPOSAL issue :)

Welcome to comments :)

lvca commented 7 years ago

@saeedtabrizi thanks for the contribution. We're super busy with 3.0, so I think this will go in 3.1 unless you don't want to do it. In this case, please get the develop branch with the new SQL engine.

saeedtabrizi commented 7 years ago

@lvca With pleasure . i start it at asap in this week . god bless you .

saeedtabrizi commented 7 years ago

@lvca there is an important question , can i use the ICU4J library in orientdb dependencies ? i need to know is it legal and allowed for orientdb 3.0 release . i try to implement it before 3.0 release .

Thanks Saeed

saeedtabrizi commented 7 years ago

@lvca i'm waiting for your reply .

lvca commented 7 years ago

I prefer to don't introduce new libraries if it's possible. How much stuff would you use from ICU4J?

saeedtabrizi commented 7 years ago

@lvca almost of all calendars and time api using the ICU4J . it supports a nice range of the unicode and international calendars , timezone data that i need to implement in calendar .

base on ICU project features :

Code Page Conversion: Convert text data to or from Unicode and nearly any other character set or encoding. ICU's conversion tables are based on charset data collected by IBM over the course of many decades, and is the most complete available anywhere.

Collation: Compare strings according to the conventions and standards of a particular language, region or country. ICU's collation is based on the Unicode Collation Algorithm plus locale-specific comparison rules from the Common Locale Data Repository, a comprehensive source for this type of data.

Formatting: Format numbers, dates, times and currency amounts according the conventions of a chosen locale. This includes translating month and day names into the selected language, choosing appropriate abbreviations, ordering fields correctly, etc. This data also comes from the Common Locale Data Repository.

Time Calculations: Multiple types of calendars are provided beyond the traditional Gregorian calendar. A thorough set of timezone calculation APIs are provided.

Unicode Support: ICU closely tracks the Unicode standard, providing easy access to all of the many Unicode character properties, Unicode Normalization, Case Folding and other fundamental operations as specified by the Unicode Standard.

Regular Expression: ICU's regular expressions fully support Unicode while providing very competitive performance.

Bidi: support for handling text containing a mixture of left to right (English) and right to left (Arabic or Hebrew) data.

Text Boundaries: Locate the positions of words, sentences, paragraphs within a range of text, or identify locations that would be suitable for line wrapping when displaying the text.

so it useful library to implement wide range of abilities .

you can see the ICU4J Feature Chart here and also performance key notes

lvca commented 7 years ago

I've just checked, it's 11MB. In practice is big as much as the entire OrientDB core! Too big dependency for such minor role. Could you use the std Java date api?

saeedtabrizi commented 7 years ago

@lvca i'm using ICU4J via maven that i taken from Maven Repository , its shows the jar file is only 2.1MB . is it ok ? if i want to implement the same abilities and same calendar data , so i think we Reinventing the wheel .

lvca commented 7 years ago

ICU4J has a non standard license (IBM?) that could break our Apache2. Any expert here?

saeedtabrizi commented 7 years ago

@lvca as i see in Oracle and the Who uses ICU :

Who Uses ICU? The following is a list of products, companies and organizations reported to be using ICU. If you have any feedback on this list (corrections, additions, or details), please contact us (on icu-support).

Companies and Organizations using ICU ABAS Software, Adobe, Amazon (Kindle), Amdocs, Apache, Appian, Apple, Argonne National Laboratory, Avaya, BAE Systems Geospatial eXploitation Products, BEA, BluePhoenix Solutions, BMC Software, Boost, BroadJump, Business Objects, caris, CERN, CouchDB, Debian Linux, Dell, Eclipse, eBay, EMC Corporation, ESRI, Firebird RDBMS, Free BSD, Gentoo Linux, Google, GroundWork Open Source, GTK+, Harman/Becker Automotive Systems GmbH, HP, Hyperion, IBM, Inktomi, Innodata Isogen, Informatica, Intel, Interlogics, IONA, IXOS, Jikes, Library of Congress, Mathworks, Microsoft, Mozilla, Netezza, Node.js, OpenOffice, Oracle (Solaris, Java), Lawson Software, Leica Geosystems GIS & Mapping LLC, Mandrake Linux, OCLC, Progress Software, Python, QNX, Rogue Wave, SAP, SIL, SPSS, Software AG, SuSE, Sybase, Symantec, Teradata (NCR), ToolAware, Trend Micro, Virage, webMethods, Wine, WMS Gaming, XyEnterprise, Yahoo!, Vuo, and many others.

Apache Projects Harmony, Lucene search library, Solr search engine server, PDFBox library, Tika metadata toolkits, Xalan XSLT, Xerces XML

Products from IBM DB2, Lotus, Websphere, Tivoli, Rational, AIX, i/OS, z/OS

Ascential Software, Cloudant, Cognos, PSD Print Architecture, COBOL, Host Access Client, InfoPrint Manager, Informix GLS, Language Analysis Systems, Lotus Notes, Lotus Extended Search, Lotus Workplace, WebSphere Message Broker, NUMA-Q, OTI, OmniFind, Pervasive Computing WECMS, Rational Business Developer and Rational Application Developer, SS&S Websphere Banking Solutions, Tivoli Presentation Services, Tivoli Identity Manager, WBI Adapter/ Connect/Modeler and Monitor/ Solution Technology Development/WBI-Financial TePI, Websphere Application Server/ Studio Workload Simulator/Transcoding Publisher, XML Parser.

Products from Google

Web Search, Google+, Chrome/Chrome OS, Android, Adwords, Google Finance, Google Maps, Blogger, Google Analytics, Google Groups, and others. Products from Apple

Mac OS X (OS & applications), iOS (iPhone, iPad, iPod touch), watchOS & tvOS, Safari for Windows & other Windows applications and related support, Apple Mobile Device Support in iTunes for Windows. Products from Microsoft

Islandwood (Windows bridge for iOS apps) Products from Harman/Becker

The following car brands are using ICU via the Harman/Becker automotive software: Alfa Romeo, Audi, Bentley, BMW, Buick, more... Related Projects There are also some related projects that wrap the existing functionality of ICU.

Is not enough yet ?

smolinari commented 7 years ago

My 2 cents is, the manipulation of date and datetime values and formating should be handled within userland and should not be an integral part of the database.

Scott

Eric24 commented 7 years ago

@smolinari - I tend to agree. I suggest that only date functions that are specifically "useful" in running queries should be considered for the database core (of course, the definition of "useful" is the key here).

For years, I have always designed databases where the only "date/time" values actually used in the database were UTC values, as either epoch timestamps or native date/time fields (if available). All other date/time functions (calculating "business" dates or holidays, adding or subtracting time, converting to/from a particular time zone, etc.) were always handled externally, with the results of those calculations being sent to the database as a UTC timestamp, and any "post-query date math" was done by the client from the UTC timestamps that were returned.

My original comment about moment.js as a model was in direct response to the original suggestion of having a way to add/subtract days/hours/etc. Personally, I'm not sure I'd use such functions, but then again, my use cases aren't everyone's use cases. So assuming there are good reasons to do date math within a SQL or JS script, the syntax of moment.js (especially in the JS world) is very popular and well understood.

saeedtabrizi commented 7 years ago

@smolinari , @Eric24 - in some cases with the international users like our country or middle east we need to have calendar calculation in the server side because we are using different calendar type. this recommendation provides a facility for database developer to handle localized problems .

in the RDBMS (MS sql Server , Oracle DB , ...) world developers have access to the same functionality for processing the date and calendar calculation.

@Eric24 the good news that i'm writing the my OCalendar class like the momentjs syntax .

saeedtabrizi commented 7 years ago

dear @lvca , @luigidellaquila - I'm ready to push my code in a develop_calendar branch now . but i'm waiting to your decision about using the ICU4J . i can implement this functionality with another way . Now orientdb changes with : 1- add ICU4J to pom . 2- added package com.orientechnologies.orient.core.sql.method.intl; 3 - added OTimeSpan , OCalendar , OSQLMethodToCalendar classes to intl package . 4 - implement source and test for add , subtract , toCalendar , toDate , toTimeZone , startOf , endOf , format , toTimeSpan , fromNow methods in OCalendar class . 5 - OSQLMethodToCalendar accepts 3 arguments ( for current version) as i said in previous comments .

I'm pleasure to hear from you .

lvca commented 7 years ago

Hi @saeedtabrizi, let me involve other OrientDB committers on this.

tglman commented 7 years ago

Hi,

I checked the lib and i noticed i explored already it for the UTF8 Encoding, it seems well supported and with no additional dependency, so for me is ok to add this dependency.

Bye

lvca commented 7 years ago

Ok, unless any other committer has something against it's ok for me too. @saeedtabrizi we're waiting for the PR ;-)

saeedtabrizi commented 7 years ago

@lvca , @luigidellaquila it's done .

lvca commented 7 years ago

@saeedtabrizi Since OrientDB 3.0 is based on Java8, we could use the new Java Time API: https://docs.oracle.com/javase/tutorial/datetime/TOC.html. How are ICU4J API better than them?

saeedtabrizi commented 7 years ago

@lvca As i know and explored , in the JAVA 8 time or util package there is no builtin implementation for supporting international calendars (like the Chinese (over 1 billion people) , Indian (over 1 billion people), Islamic (over 1 billion people), Persian (over 80 million people), Japanese (over 100 million people) , ....) (only Gregorian calendar is available currently) and we must re implement multiple calendars and localization in orientdb (reinventing the wheel).

Remember the world population is only 7 billion currently and at least the half of population in the world can be the users lives with the different culture and calendar that java does not built in support their calendars in JDK (that was painfully) . if i want to say only a weakness the java vs .net framework , certainly i say supporting the complete builtin globalization in java .

Also we are introduced OCalendar class that wrapping multiple calendar implementation methods . so if in the future java decided to support international calendars , we can use it without any problem and remove ICU4J dependency , even we don't care about other useful and performance in ICU4j package .

Overall , i prefer to use ICU4J that using and maintaining via a world leader IT companies . also ICU project is used in several open source and popular projects like the NodeJs and also big product or projects .

lvca commented 7 years ago

Super clear thanks. Also the fact we're wrapping ICU allows us to change to something else in case.

TropicalPenguin commented 7 years ago

It would be nice to also expose an ISO 8601 calendar.

MenoData commented 6 years ago

If you think that ICU4J is too big then you might consider to have a look at my open-source calendar library Time4J. On the realm of date and time, it can do many things more accurate and performant than ICU4J and is based on the same i18n-data (actually CLDR-v32). It also offers duration formatting (in 90+ languages) and many calendars. Link to the calendar-part of the API:

http://time4j.net/javadoc-en/net/time4j/calendar/package-frame.html

By the way, I am skeptic if Oracle will ever implement calendars like the persian one (see this JDK-issue which was closed in July 2017 with the comment "Won't fix as not a common request. ").