Nice and clean ;-)
we also have to change code here:
```python
# Add next departures with their lines
self._attributes["next_departures_lines"] = {}
if self._departure:
    for stop in self._departure:
        # if self._name.startswith(stop["stop_id"]):
        if self._stop["stop_id"] == stop["stop_id"]:
            self._attributes["next_departures_lines"] = stop["departure"]
            self._attributes["latitude"] = stop["latitude"]
            self._attributes["longitude"] = stop["longitude"]
```
I think I will have to chase another issue: startup is taking more than 60 seconds .... And the question is: how to trace? Where do we spend most of the time... ??? Loop? Tables? SQL statements? ... hummm
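One low-tech way to narrow it down (just a sketch, not part of the integration): wrap the suspected sections with a timer and log the elapsed time at DEBUG level, then compare the numbers in the HA log:

```python
import logging
import time

_LOGGER = logging.getLogger(__name__)

start = time.perf_counter()
# ... suspected slow part, e.g. the local-stops SQL query or the stop loop ...
_LOGGER.debug("local stops query took %.2f s", time.perf_counter() - start)
```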
I did a "soft restart" of Home Assistant and now it's fine. No more warnings... Strange. So from time to time it starts in less than 10 seconds, and... sometimes more than 60... that's huge
I forgot a bit about that one... am running a daily (spook) service to remove orphaned entities so these disappear for me. Can you provide a PR, then I can apply it and verify myself for a few weeks. I am also not sure how these would be re-used; the code (that I did not analyse in detail) seems to go to 999... so still a lot of sensors are feasible, or?
On the start-up, same here.... I am not sure which one keeps it longer
It's also my first time using Github..... Right now I'm directly changing the running code ... ;-))))) I hardcoded the "fill with zeroes" => str(counter).zfill(3) Remember it is per "device/person" and per "datasource" ... well, it is per "vicinity" when doing the setup of gtfs. I'm not expecting more than 999 stops of the same datasource around the current GPS location .... ;-) ;-) ;-) ... should we ?? ;-)
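For illustration, the zero-padded counter naming would look roughly like this (the "_localstop_" suffix mirrors the snippet quoted later in this thread; the exact format is still open):

```python
# counter 7 for device tracker "person.fabien" -> "007_localstop_person.fabien"
name = str(counter).zfill(3) + "_localstop_" + device_tracker_id
```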
Well, I move between 3 providers here, e.g. Basel, and in Basel there are dozens of stops. I easily end up with 30+ after a 2-day visit... so again... how would these be re-used?
Indeed .... now the name is generic .... but not yet re-used I think... arghhh.... (I will be back ;-) )
And this is just one element, can you imagine the time I spent to find all the data based on source differences, calendar, calendar_dates, API keys, pygtfs with errors and incompleteness, adding real-time, protobuf and json sources, etc. etc. Although I copied a bit from gtfs (core), that one proved faulty in a lot of places
btw, the way I work with github
Well, I restored HA. Your latest version searches for a zip file at boot, then rebuilds the DB. It also rebuilds the whole DB if I re-create vicinity settings.... I was not able to find out how to have entities updated. Looks like my DB is deleted and re-created all the time.
your latest version
which one?
search for zip file at boot / re-created all the time
? no such problems with me or others, not sure what you are doing, do you have a service call that triggers?
The version retrieved yesterday 6pm via github. I guess the latest published.... with deletion of sqlite and also managing dates in the zip file and "crying" for a zip file at startup ;-) ;-)...
I'll have to try again. I think it was related to my gtfs settings... no longer compatible.. distance over 1000m and/or more than 15 stops around the location...
PS: radius = ( 360 * self._data.get("radius", DEFAULT_LOCAL_STOP_RADIUS) ) / ( 40000 * 1000 ) => 1 meter = 360/40,000,000 = 9e-6 degrees (a bit higher than 1/130000 = 7.69e-06)
:-)
I did not make a release yet... the latest version is release 0453 and this is used by a few people (myself of course); the one with the sqlite check is 'main'
for the radius... where did you find that 'formula' ? EDIT: can add easily but need to know if this is the 'best' :)
Pure logic. The circumference of the Earth is 40,000 km (hopefully the Earth is nearly a perfect sphere, and if we would like to be more precise we should consider elevation... to retrieve a few stops, that's a bit overkill....) => 360 degrees cover 40,000 km => 360/40,000,000 degrees = 1 m
The real formula to compute the distance between 2 coordinates is way more complex ( sin(), cos(), square root....) We cannot ask the DB to compute that on the fly within the SQL statement ;-))))) Currently we use a square area, but we could (should!! should!) "post-filter" in Python... :-)
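A minimal sketch of such a post-filter, assuming the SQL already returned the stops inside the bounding square (standard haversine formula; the helper names are only illustrative):

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two coordinates."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def post_filter(stops, lat, lon, radius_m):
    """Keep only the stops really inside the radius, after the cheap square pre-selection in SQL."""
    return [
        s for s in stops
        if haversine_m(lat, lon, s["latitude"], s["longitude"]) <= radius_m
    ]
```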
About the main topic (remember the title "prevent flooding HA..."), I created my own class to store the "stoplist" in a dedicated entity.... and I added "gtfs2_" on all entities created... So we can filter them in the HA recorder. Still in progress
ok...changed to /111111
Same issue again.... deleted all settings in gtfs, moved your source code into gtfs2, rebooted (to make sure) -> add new integration gtfs2 -> add new source -> 20 minutes to create sqlite then nothing ... no integration available in devices... ok .. maybe that's fine. -> add integration -> gtfs2 -> create setup for vicinity -> chose my datasource (just freshly uploaded) -> BOUM
=> DB destroyed => upload in progress...
which version? Which source? Logs? HAOS or HA docker?
EDIT: for logs, if you use portainer you can also examine the 'print' statements from pygtfs doing the unpacking zip>sqlite (they did not implement logging)
I'm running HAOS. Source: github, I did a copy-paste from my PC directly into HA custom_components/gtfs2.
in the log:
2024-05-19 16:44:50.882 DEBUG (SyncWorker_14) [custom_components.gtfs2.gtfs_helper] Getting gtfs with data: {'extract_from': 'zip', 'file': 'TEC-GTFS', 'url': 'na'}
2024-05-19 16:44:50.883 DEBUG (SyncWorker_14) [custom_components.gtfs2.gtfs_helper] Checking if extracting: TEC-GTFS
2024-05-19 16:44:50.884 DEBUG (SyncWorker_14) [custom_components.gtfs2.gtfs_helper] Checking if file contains only future data: TEC-GTFS.zip
2024-05-19 16:44:50.894 DEBUG (SyncWorker_14) [custom_components.gtfs2.gtfs_helper] Youngest calender date from new files: ['20240412', '20240429'], is: 2024-04-12 00:00:00
2024-05-19 16:44:50.895 DEBUG (SyncWorker_14) [custom_components.gtfs2.gtfs_helper] New file is not containing only newer dates, removing current/copied sqlite
2024-05-19 16:45:05.044 INFO (SyncWorker_14) [custom_components.gtfs2.gtfs_helper] Exiting main after start subprocess for unpacking: TEC-GTFS.zip
2024-05-19 16:45:05.045 DEBUG (MainThread) [custom_components.gtfs2.config_flow] Checkdata pygtfs: extracting with data: {'extract_from': 'zip', 'file': 'TEC-GTFS', 'url': 'na'}
2024-05-19 17:02:09.104 DEBUG (MainThread) [custom_components.gtfs2.gtfs_helper] Getting datasources for path: gtfs2
2024-05-19 17:02:09.105 DEBUG (MainThread) [custom_components.gtfs2.gtfs_helper] Datasources in folder: ['TEC-GTFS']
2024-05-19 17:02:26.274 DEBUG (MainThread) [custom_components.gtfs2.config_flow] UserInputs Local Stops: {'file': 'TEC-GTFS', 'device_tracker_id': 'zone.gtfs_test_location', 'name': 'GTFS_TES
2024-05-19 17:02:26.275 DEBUG (SyncWorker_30) [custom_components.gtfs2.gtfs_helper] Getting gtfs with data: {'file': 'TEC-GTFS', 'device_tracker_id': 'zone.gtfs_test_location', 'name': 'GTFS_
2024-05-19 17:02:26.275 DEBUG (SyncWorker_30) [custom_components.gtfs2.gtfs_helper] Checking if extracting: TEC-GTFS
2024-05-19 17:02:26.275 DEBUG (SyncWorker_30) [custom_components.gtfs2.gtfs_helper] Checking if file contains only future data: TEC-GTFS.zip
2024-05-19 17:02:26.282 DEBUG (SyncWorker_30) [custom_components.gtfs2.gtfs_helper] Youngest calender date from new files: ['20240412', '20240429'], is: 2024-04-12 00:00:00
2024-05-19 17:02:26.282 DEBUG (SyncWorker_30) [custom_components.gtfs2.gtfs_helper] New file is not containing only newer dates, removing current/copied sqlite
2024-05-19 17:02:40.271 INFO (SyncWorker_30) [custom_components.gtfs2.gtfs_helper] Exiting main after start subprocess for unpacking: TEC-GTFS.zip
2024-05-19 17:02:40.271 DEBUG (MainThread) [custom_components.gtfs2.config_flow] Checkdata pygtfs: extracting with data: {'file': 'TEC-GTFS', 'device_tracker_id': 'zone.gtfs_test_location', '202
Yep...see the issue... I only checked extracting files not setting up new routes...maybe tomorrow (apéro now)
If there is ever one thing I would like someone to do, it is to write a few 'automated' tests; there is so much to check and I quite often forget an end-to-end check
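For example, even a tiny pytest module exercising pure helpers would already catch regressions (the helper below is hypothetical, just to illustrate the idea):

```python
# test_gtfs2_helpers.py -- hypothetical example, not existing tests


def degrees_per_meter() -> float:
    # 360 degrees over the ~40,000 km Earth circumference
    return 360 / 40_000_000


def test_degrees_per_meter():
    assert abs(degrees_per_meter() - 1 / 111_111) < 1e-9
```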
I think there are 2 big parts in this development. 1) Upload / maintenance of DB 2) Usage of DB.
I'm afraid of anything automatic regarding step 1..... a flag "Automatic Upload" in the settings, defaulted to false?
small thing to change...sometimes small things have big consequences
Github Desktop doesn't find any change in the code.... what have you changed?
So your PR only changes this? ... btw, it should also update the unique id
Github Desktop doesn't find any change in the code.... what have you changed?
I had already pushed it to main, maybe that is why you cannot see it?
Entity_id... I have been fighting many hours to have it working .... ... or .... when HA suddenly decides to not use it and uses the name..... when sometimes HA says "not valid"... or something else.... try again, restart, check logs, change again, wait for the sensor to refresh, try again, fix, reboot, ... the hard way! Of course, like always, not many recent examples to follow .... solution: "trial-and-error"
Maybe now it can be simplified... ;-)... I don't understand the usage of attributes starting with '_' .... are those the real ones used by HA?? Is there a mapping/copy somewhere?
Not sure about the unique_id, maybe. (some people commented it was for "internal purpose") Some tests also went wrong with duplication of sensors ending with "_2" "_3" "_4". ;-) (The nightmare of HA ;-) )
i don't understand the usage of attributes starting with '_'
where?
example here:

```python
self._stop = stop
self._name = self._stop["stop_id"] + "_localstop" + self.coordinator.data['device_tracker_id']
self._attributes: dict[str, Any] = {}

self._attr_unique_id = "sensor.gtfs2_" + self._name
self._attr_unique_id = self._attr_unique_id.lower()
self._attr_unique_id = self._attr_unique_id.replace(" ", "")
self.entity_id = self._attr_unique_id
```
How do we know which are the existing ones "inherited" from HA, I guess, vs those created in our code..... ?
BTW: I'm using "Studio Code Server" installed via HACS, .. maybe there is a setting somewhere to add "real" autocompletion and help? Right now it's just a text editor with multiple tabs... ;-) ;-)
Fabien
I cannot easily see your changes... If I open it I see your changes.... But in my mind it should be compared to my github repository.
self._stop = stop
development choice ..... If used consistently, this helps to differentiate _stop from stop. It is not a 'must' but it helps when analysing code.
e.g. if you use self.stop = stop and way down in the code, after passing things on (as part of self), you see
if stop == "12344" : ... then you may have made a typo where it should be self.stop and not stop
IF used consistently, that is... I catch myself not always doing this
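A tiny illustration of the convention (hypothetical code, just to show the idea):

```python
class StopSensor:
    def __init__(self, stop: dict) -> None:
        # The leading underscore marks "internal to this class" and keeps the
        # attribute visually distinct from the local argument `stop`.
        self._stop = stop

    def matches(self, stop_id: str) -> bool:
        # Accidentally writing `stop["stop_id"]` here would fail loudly with a
        # NameError instead of silently comparing against the wrong variable.
        return self._stop["stop_id"] == stop_id
```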
On your screenshot, no clue...
So far so good. My sqlite file is still there ... ;-))))
How to test ?
Should i remove the "remaining" ZIP file ?
Replace with original one with "shapes" inside?
create new sensors ?
Test... well... you need a file that adheres to the exception (future date) and then load that. What it should do is keep the current sqlite and keep the downloaded zip as well (the pair is a 'must' for pygtfs in certain cases). You can use the service call to update it from a config/www/xyz.zip file
I moved the class "stoplist" into a dedicated file. Here is the latest version with some new features:
I know, too many hardcoded values...
I forgot, we need:
```python
import geopy.distance
from homeassistant.helpers.event import (
    async_track_state_change_event,
    async_track_time_interval,
)
from homeassistant.const import (
    STATE_UNAVAILABLE,
    STATE_UNKNOWN,
    EVENT_HOMEASSISTANT_STARTED,
)
```
Sensor created with basic info at startup (no DB select at HA startup); added a listener on the EVENT_HOMEASSISTANT_STARTED event to finish the init and perform the first full refresh/update (see the sketch after this list).
=> I do not immediately understand the diff with before. When I restart HA I want (!) the sensor to be loaded 'immediately' with new/current data; isn't this more code (and maintenance and test) for the same target?
Added a listener on the device tracker to receive new coordinates in real time. Added a warning (and error) if refreshed/updated too frequently (warning below 30 seconds, ERROR AND SKIP if < 5 seconds); skip the update if the new GPS coordinates moved < 5 meters. => This is nice for the local stops but must be configurable then; I have seen rt-providers that do not even push or allow updates below 1 minute... hence my fixed-route sensor refreshes per definition once per minute and only updates static data when that configured refresh frequency is hit.
Consider data obsolete above 300 seconds and update/refresh, otherwise silently skip the update ;-) ;-) ;-) => should also be configurable, there is no need to refresh e.g. during the night or when not using the datasource. E.g. I have Zou + Basel + Netherlands and only need 1 to be active
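For reference, a minimal sketch of how those listeners could be wired into the sensor; the class/method names and the 5 m / 30 s thresholds are purely illustrative, not the actual implementation:

```python
from datetime import datetime, timedelta

import geopy.distance
from homeassistant.const import EVENT_HOMEASSISTANT_STARTED
from homeassistant.helpers.event import async_track_state_change_event


class LocalStopSensor:  # hypothetical; the real class derives from CoordinatorEntity/SensorEntity
    def __init__(self, hass, coordinator, device_tracker_id):
        self.hass = hass
        self.coordinator = coordinator
        self._device_tracker_id = device_tracker_id
        self._last_pos = None
        self._last_update = None

    async def async_added_to_hass(self) -> None:
        # Finish the init only once HA has fully started (no DB select at startup).
        self.hass.bus.async_listen_once(
            EVENT_HOMEASSISTANT_STARTED, self._async_finish_init
        )
        # Listen on the device tracker to receive new coordinates in real time.
        async_track_state_change_event(
            self.hass, [self._device_tracker_id], self._async_tracker_moved
        )

    async def _async_finish_init(self, _event) -> None:
        await self.coordinator.async_request_refresh()

    async def _async_tracker_moved(self, event) -> None:
        new_state = event.data.get("new_state")
        if new_state is None:
            return
        new_pos = (
            new_state.attributes.get("latitude"),
            new_state.attributes.get("longitude"),
        )
        if None in new_pos:
            return
        now = datetime.now()
        # Skip movements below 5 m and refreshes closer than 30 s apart.
        if self._last_pos and geopy.distance.distance(self._last_pos, new_pos).m < 5:
            return
        if self._last_update and now - self._last_update < timedelta(seconds=30):
            return
        self._last_pos = new_pos
        self._last_update = now
        await self.coordinator.async_request_refresh()
```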
Overall I like gimmicks, but keep in mind that the complexity quickly increases and one should use HA functionality to refresh (service calls / automations) so you give more control to the end user
And... finally... please provide a PR with not too many changes; it is quite a challenge to review the impact, plus I cannot test it all and would like to avoid releases that 'crash' with various users needing a reset or quick follow-up.
Looked a bit at the code, would prefer
Too many topics in this thread, it lost its context and the original solution proposal was rejected (by me)
Describe the solution you'd like
gtfs creates entities with the name of the "stops", which leads to creating all stops you have ever crossed ;-)))))) Which can be ... A LOT in "small cities" like Paris, New York, Tokyo... in just a couple of hours/days ....
if we replace the "stop name" with a counter, we could re-use / refresh existing ones...
Describe alternatives you've considered
Additional context