yacy / yacy_search_server

Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance
http://yacy.net

Live music stream gets stuck in crawler queue only to end when a termination is issued and crawling. #565

Closed smokingwheels closed 1 year ago

smokingwheels commented 1 year ago

A live music stream gets stuck in the crawler queue and only ends when a termination is issued.

I also hit a Java memory error (memory consumed, not enough to..53 mb) twice today while crawling 2 URLs at depth 7. I increased the JVM memory setting to 6000.

Live music stream to test with:

http://s2-webradio.oldie-antenne.de/oldie-antenne/stream/aacp

VirusTotal has the same problem with this URL: https://www.virustotal.com/gui/url-analysis/u-b2ee8ddb3bb18e5703ad760bdd91f73e16115300877d70769aa98940865f5328-1677233509
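A live stream never sends an end-of-body, so any fetcher that reads until EOF will sit on that URL until it is killed. Below is a minimal sketch of one way a crawler could guard against this: skip `audio/` and `video/` content types, and cap how many bytes are read from any response. The class `BoundedFetch`, its method names, and the cap value are hypothetical and only for illustration; this is not YaCy's actual loader code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of YaCy. */
final class BoundedFetch {

    /** True if the Content-Type header suggests continuous audio/video rather than a document. */
    static boolean looksLikeMediaStream(String contentType) {
        if (contentType == null) return false;
        String ct = contentType.trim().toLowerCase(Locale.ROOT);
        return ct.startsWith("audio/") || ct.startsWith("video/");
    }

    /** Reads at most maxBytes from the stream, then stops even if more data keeps arriving. */
    static byte[] readCapped(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (out.size() < maxBytes) {
            // Never request more than the remaining budget.
            int n = in.read(buf, 0, Math.min(buf.length, maxBytes - out.size()));
            if (n < 0) break; // normal end of body
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

With a check like this, a response with `Content-Type: audio/aacp` could be skipped before the body is read, and anything else would be cut off at the cap. A byte cap alone does not help with a stream that trickles data very slowly, so a real crawler would probably want a wall-clock limit as well.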

Opening the link in a browser results in an MKV video with sound. Looking at the results of my crawl of the help.nextcloud.com forum, which is slow to load (it uses Discourse), I have ~50% index crawler errors.

A traceroute from the US to nextcloud.com in Germany shows over 2500 ms, and DNS also appears to be slow. @Orbiter, is this a problem for YaCy?

Anyone is welcome to read, comment, or offer advice.

I have really only load-tested YaCy over the last 16 years, though I do use it here and there. I'm working on a Windows version, but there is an issue with packaging Java into the downloaded version.

My current Nextcloud at home is full and turned off until I test and buy larger storage space and configure it in an untested mode, mirroring the data on the server.

https://help.nextcloud.com/t/forum-performance-still-a-problem/145158/34

VirusTotal on the MKV: analysis in progress, started 55 minutes ago. It usually takes 1 minute.

http://cloud-party.undo.it/index.php/s/5smdBiGgAmKQNtG

Error: index table 0007: not enough RAM. The Java VM topped out at 2 GB. Garbage collection is not working, by the looks of it.
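To check whether a raised memory setting actually reached the JVM, and how close the heap is to its ceiling, the standard `Runtime` API can report it. This is a minimal sketch; the class name `HeapReport` is hypothetical and it is not how YaCy itself reports memory.

```java
/** Hypothetical helper for illustration; not part of YaCy. */
final class HeapReport {

    /** Returns used, committed, and maximum heap sizes in MB as seen by this JVM. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        long used = rt.totalMemory() - rt.freeMemory(); // heap currently occupied
        return String.format("used=%d MB, committed=%d MB, max=%d MB",
                used / mb, rt.totalMemory() / mb, rt.maxMemory() / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If `max` still shows roughly 2 GB after the setting was changed, the new limit was not picked up by the running process.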

http://cloud-party.undo.it/index.php/s/NwLmL3qqekBEDw8

smokingwheels commented 1 year ago

Crawling the same sites at depth 4. The errors may be DNS-related.

Pending in Crawler discourse.pi-hole.net 38030/7350/1929/72 URLs
Pending in Crawler help.nextcloud.com 31527/20516/1865/46 URLs
Pending in Crawler snapcraft.io 26329/11107/0/164 URLs
Pending in Crawler nextcloud.com 11902/10541/0/1893