comtravo / ctparse

Parse natural language time expressions in python
https://www.comtravo.com
MIT License

`latent_time=False` works only half-way #126

Open roskoN opened 2 years ago

roskoN commented 2 years ago

Description

Setting latent_time=False still applies latent rules (e.g. "ruleLatentDOY") and consequently returns a datetime resolved relative to the current date. I would expect none of the latent rules to be applied.

What I Did

I ran the following command:

 from ctparse import ctparse
 list(ctparse("4.Jan 22", latent_time=False, debug=True))

which produces the following output:

[CTParse(2023-01-04 22:00 (X/X), (108, 103, 129, 'ruleHHMM', 'ruleDOM1', 'ruleNamedMonth', 'ruleDOMMonth', 'ruleLatentDOY', 'ruleDateTOD'), 24.70392674175538),
 CTParse(2022-12-04 X:X (X/X), (108, 103, 129, 'ruleHHMM', 'ruleDOM1', 'ruleNamedMonth', 'ruleLatentDOM'), -1379.6945574679112),
 CTParse(X-01-X X:X (X/X), (108, 103, 129, 'ruleHHMM', 'ruleDOM1', 'ruleNamedMonth', 'ruleLatentDOM'), -974.2294493597469),
 CTParse(X-X-X 22:00 (X/X), (108, 103, 129, 'ruleHHMM', 'ruleDOM1', 'ruleNamedMonth', 'ruleLatentDOM'), -1379.6945574679112),
 CTParse(2023-01-04 X:X (X/X), (108, 103, 129, 'ruleNamedMonth', 'ruleDOM1', 'ruleDOMMonth', 'ruleLatentDOY'), -464.62600230356384),
 CTParse(2022-11-22 X:X (X/X), (124, 108, 'ruleDOM1', 'ruleDDMM', 'ruleLatentDOY', 'ruleLatentDOM'), -1386.2557536714821),
 CTParse(X-01-04 X:X (X/X), (124, 108, 'ruleDOM1', 'ruleDDMM', 'ruleLatentDOM'), -471.5873006213257),
 CTParse(2023-01-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleMonthDOM', 'ruleLatentDOY', 'ruleDOM1', 'ruleLatentDOM'), -283.055702689241),
 CTParse(X-01-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleMonthDOM', 'ruleDOM1', 'ruleLatentDOM'), -285.2758563452916),
 CTParse(2022-11-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleDOM1', 'ruleDOMMonth', 'ruleLatentDOY', 'ruleLatentDOM'), -1381.2580980306086),
 CTParse(X-01-04 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleDOM1', 'ruleDOMMonth', 'ruleLatentDOM'), -468.07426686688797),
 CTParse(X-X-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleDOM1', 'ruleLatentDOM'), -1387.0195868535332),
 CTParse(2023-01-04 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleDOMMonth', 'ruleLatentDOY', 'ruleDOM1', 'ruleLatentDOM'), -462.24216151516646),
 CTParse(2022-11-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleDOMMonth', 'ruleLatentDOY', 'ruleDOM1', 'ruleLatentDOM'), -1378.5328933893213),
 CTParse(X-01-04 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleNamedMonth', 'ruleDOMMonth', 'ruleDOM1', 'ruleLatentDOM'), -465.9835442173078),
 CTParse(2023-01-04 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleDOM1', 'ruleNamedMonth', 'ruleDOMMonth', 'ruleLatentDOY', 'ruleLatentDOM'), -457.9653277397515),
 CTParse(2022-11-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleDOM1', 'ruleNamedMonth', 'ruleDOMMonth', 'ruleLatentDOY', 'ruleLatentDOM'), -1374.2560596139065),
 CTParse(X-01-04 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleDOM1', 'ruleNamedMonth', 'ruleDOMMonth', 'ruleLatentDOM'), -463.7046680492498),
 CTParse(2022-12-04 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleDOM1', 'ruleNamedMonth', 'ruleMonthDOM', 'ruleLatentDOY', 'ruleLatentDOM'), -1378.5807416488096),
 CTParse(2023-01-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleDOM1', 'ruleNamedMonth', 'ruleMonthDOM', 'ruleLatentDOY', 'ruleLatentDOM'), -279.96845298069996),
 CTParse(X-01-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleDOM1', 'ruleNamedMonth', 'ruleMonthDOM', 'ruleLatentDOM'), -283.9414417860744),
 CTParse(X-X-22 X:X (X/X), (108, 103, 108, 'ruleDOM1', 'ruleDOM1', 'ruleNamedMonth', 'ruleLatentDOM'), -1385.4914732680768)]

P.S. Thank you for your great work!

sebastianmika commented 2 years ago

Hi,

I very much understand the confusion: latent_time=False suggests that no latent resolution would be applied at all.

However, I am afraid that is not how it is meant or implemented. From the docs:

:param latent_time: if True, resolve expressions that contain only a time
                    (e.g. 8:00 pm) to be the next matching time after
                    reference time *ts*

This parameter applies only to expressions that contain nothing but a time, which is probably why we called it latent_time. Our use case when building this was simply that we did not want "pure time" expressions to be grounded to some (in that case arbitrary) date.
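
To make the documented behaviour concrete, here is a minimal standard-library sketch of what "next matching time after reference time ts" could mean for a time-only expression. It is not ctparse's actual implementation: the helper name `resolve_time_only`, its signature, and the choice to treat a time equal to ts as already past are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def resolve_time_only(
    ts: datetime, hour: int, minute: int, latent_time: bool
) -> Optional[datetime]:
    """Hypothetical helper (not part of ctparse).

    Mimics the documented idea of latent_time for an expression that
    contains only a time, such as "8:00 pm".
    """
    if not latent_time:
        # Leave the time ungrounded: no date is attached to it.
        return None
    # Try the requested time on the same day as the reference time.
    candidate = ts.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Assumption: a time at or before ts counts as past, so roll to the next day.
    if candidate <= ts:
        candidate += timedelta(days=1)
    return candidate


ts = datetime(2022, 11, 21, 21, 30)
print(resolve_time_only(ts, 20, 0, latent_time=True))   # 2022-11-22 20:00:00
print(resolve_time_only(ts, 22, 0, latent_time=True))   # 2022-11-21 22:00:00
print(resolve_time_only(ts, 20, 0, latent_time=False))  # None
```

Note that this sketch covers only pure time expressions. An input such as "4.Jan 22" contains date parts, so, per the explanation above, the flag would not be expected to stop rules like ruleLatentDOY from being applied.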