Closed mcostalba closed 6 years ago
I agree and I would even volunteer. But I think my knowledge of database and server maintenance is just not enough (although I think I could help with stats and optimization).
I agree that some modernization is needed, and it would benefit from more active maintainership. Presumably by adding a few more active maintainers to the official fishtest repo.
@Stefano80, from my point of view, there is no need for a single person with all of Gary's skills; a small team would work as well and would reduce the bus factor of fishtest. @ianfab and @ppigazzini have been doing quite some work on fishtest...
It's unnecessary for a single person to take over the job. A few Python developers can do that. I'm happy to contribute if we have a team.
Yes, if we could convince @ppigazzini to step up, I would gladly join more active development efforts. I was always somewhat discouraged by the inactivity of fishtest. (Without any criticism of Gary, let me be very explicit about that, everybody has a lot to do here...)
I have some spare time only at weekends, and I'm not a developer (I learnt some very basic Python only to contribute some patches to fishtest). Like many of you, I think that looking for a @glinscott clone (developer, server administrator, GitHub repo maintainer, chess developer, etc.) could be unsuccessful; perhaps it's better to try to build a team with all those skills.
I suggest these steps:
PS: I think that atm @ianfab is the person with the most knowledge of fishtest and fishtest administration (just below Gary), but I also think that he is very busy with multivariant Stockfish/fishtest (he has contributed well over 90% of the patches).
Hi @ppigazzini, in principle I agree. I would reorder your plan as follows:
I will try to continue contributing to fishtest development, but I do not have much knowledge about server and database maintenance/administration (although I am at least able to keep multi-variant fishtest somehow running most of the time), and I am also quite busy for the reasons @ppigazzini mentioned. So I could contribute to the code (even if it might be mostly minor improvements, as in my open PRs), but not really to the server administration.
@Stefano80
@mcostalba you never posted in the threads about the stalled fishtest development, so I'm curious to know the new features that you have in mind.
We need a team leader. Someone who gives general direction, otherwise we can't work as a team!
@mcostalba Do you want to be a team leader?
Hi @ppigazzini, I think we need to work on 3 fields:
General stability: We don't have an acceptance test environment for fishtest, and we actually need one; without it fishtest will remain inherently fragile.
Endgame testing: we need a concept to test TB patches, and we need an endgame dataset for testing endgame patches.
Tuning: we need a concept to evaluate tuning strategies, and we need to put some hard work into improving SPSA; input from @ilvec would probably be useful.
What are your thoughts?
@Stefano80 I leave it to the developers to suggest new chess-related fishtest features.
Some short term goals:
Some medium term goals:
Some long term goals:
Regarding the acceptance test, I think that atm it suffices to follow this workflow:
What about this one?
It is written in Go (actually it is a way for me to learn Go, because I am totally new to it). It reads from the same MongoDB as the official fishtest, fetching data from there (through a private VPN), so it is in strictly read-only mode to avoid any issues for the official fishtest site.
I like it better than the current one. What does the cX tY after the score mean?
@Stefano80 crashes and time losses
For what it is worth: This new layout looks great! Strange there is so little enthusiasm.
I find it great too! Should have said with more emphasis!
Looks great!
I have one question, is it possible to include part of the commit ref in the test name? If people (I am guilty of this!) submit multiple tests with altered code on the same branch, we need to click on the test to remember which code version it is. A couple of letters from the commit ref giving test_name_xy instead of test_name might be a bit easier to follow.
@xoto10 this is a good idea. I will do. Thanks.
Here we go:
I have replaced the three-dot ellipsis with the first 4 digits of the SHA of the new commit.
I have added the machines page (opened by clicking on the number under the green LEDs):
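A minimal sketch of the naming idea discussed above, assuming the test name and the full commit SHA are available as plain strings. It follows the `test_name_xy` form suggested earlier with a 4-character prefix; the function name `display_name` and the sample SHAs are hypothetical and are not taken from fishtest or from the new page.

```python
def display_name(test_name: str, commit_sha: str, digits: int = 4) -> str:
    """Append the first `digits` characters of the commit SHA to the test name."""
    if len(commit_sha) < digits:
        raise ValueError("commit SHA is shorter than the requested prefix")
    return f"{test_name}_{commit_sha[:digits]}"


# Two tests submitted from the same branch at different commits
# (both SHAs are made up for illustration).
print(display_name("my_branch", "9ce6daae1b2c3d4e"))
print(display_name("my_branch", "2011607f406a7b8c"))
```

With a short prefix like this, two submissions from the same branch can be told apart in the list without opening each test.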
Looks great!
I had another thought. Is it possible to display the wins and losses for each colour, or is this information not available from cutechess?
On 22 Oct 2017 19:08, "Marco Costalba" notifications@github.com wrote:
What about this one?
[image: image] https://user-images.githubusercontent.com/1099265/31864668-9ce6daae-b761-11e7-9261-2011607f406a.png
- All tests in one page (with infinite scrolling btw)
- LED (green/yellow/gray) on the left to show started, pending or finished state
- New graphic and arrangement of the columns
- Black border of the score box to show LTC tests
- Link rendering in the test description column
- Fancy date format
- Number of active machines for each running test (see the number under the green LED)
@xoto10 no, I don't think it is available in fishtest (I am not sure for cutechess).
Marco, I think it would be helpful to have a real-time graph for each test, the x-axis being the number of games played, and the y-axis being the current LLR value, along with the upper and lower bounds for passing and failing. That way you could see a test's history, and also see visually how close it was and is to passing or failing. -Bryan
@crossbr I have implemented a python script to do that, see https://groups.google.com/d/msg/fishcooking/0QTFBQJcuas/WQro-FSTAQAJ or https://github.com/vondele/FishTestWatch
@crossbr yes, this is a good idea but it requires an update to the DB (or using an external DB)
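For the graph idea above, the two horizontal lines would be the SPRT thresholds. Below is a minimal sketch using Wald's standard approximation for those bounds. The error rates `alpha = beta = 0.05` are an assumption for illustration, not values confirmed in this thread, and the LLR history is made up; real data would have to come from the fishtest database.

```python
import math
from typing import List, Tuple


def sprt_bounds(alpha: float = 0.05, beta: float = 0.05) -> Tuple[float, float]:
    """Return Wald's approximate (lower, upper) LLR thresholds for an SPRT."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def classify(llr: float, lower: float, upper: float) -> str:
    """Map a current LLR value to a test state."""
    if llr >= upper:
        return "passed"
    if llr <= lower:
        return "failed"
    return "running"


# Hypothetical history of (games played, LLR) points for one test.
history: List[Tuple[int, float]] = [(1000, 0.4), (5000, 1.7), (9000, 2.5), (12000, 3.0)]

lower, upper = sprt_bounds()
for games, llr in history:
    print(games, round(llr, 2), classify(llr, lower, upper))
```

Plotting `history` against the two constant lines `lower` and `upper` would give the picture described: the curve shows how close the test came to either bound before it stopped.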
@vondele nice! Indeed, in my very little free time, I implemented live update functionality. This can't be shown with a screenshot, but it means that the main page and the machine page are not static but update live while you are watching :-) reflecting changes in the underlying data. I have used a websocket to keep a connection alive between server and browser and push updates to the browser view: learning a lot of new stuff in the process...
@mcostalba Would dates be better displayed in YYYY-MM-DD format? Could you display number of active cores instead of active machines?
- Fancy date format
- Number of active machines for each running test (see the number under the green LED)
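As a small illustration of the YYYY-MM-DD format asked about above, here is a sketch in Python. The thread says the new page is written in Go, so this is only an illustration of the format, not the page's code; `format_date` and the sample timestamp are hypothetical.

```python
from datetime import datetime


def format_date(ts: datetime) -> str:
    """Format a timestamp as YYYY-MM-DD."""
    return ts.strftime("%Y-%m-%d")


# Made-up timestamp for illustration.
print(format_date(datetime(2017, 10, 24, 19, 52)))
```

One practical advantage of this format is that sorting the strings alphabetically also sorts them chronologically.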
My suggestion is we get one or two of the smallest things fixed asap. e.g. how do we get the timeout changed back from 30 minutes to 5 minutes?
Btw, this thread was about stalled fishtest. Are we going anywhere with that? I thought there was some kind of agreement that @ppigazzini (and I) would be interested and available to do some work. @mcostalba: thoughts from your side?
I now see that I was mentioned here. I think that fishtest should be open to 2 different tuners, and then the worse one could in the future be replaced with a new attempt.
Also, some strong opening books could reduce statistical errors significantly. I developed 2moves_strong book and I'm doing statistical analysis for it at the moment.
This is great @mcostalba!
Of course, I'd be more than happy to hand over the keys to fishtest. I'm happy to keep the server up, but I just don't have time to update the code these days. I don't want to make changes because it's so damn stable now :). But if someone else is willing to take it on, that would be great!
@Stefano80 and @ppigazzini I've invited you as collaborators on the fishtest repo. Send me an email at glinscott@gmail.com, and I'll give you credentials to the server.
@glinscott : done @mcostalba @Stefano80 : we need a list of features and milestones
Hi @ppigazzini , @mcostalba : I think the first milestone is to decide what to do with the several PRs open on the repository.
Sorry to say, but ... Not only is fishtest development stalled; fishtest maintenance is stalled too, which is more critical. More and more often the workers get: Timeout: HTTPConnectionPool(host='tests.stockfishchess.org', port=80): Request timed out. (timeout=5.0)
Peter, the server is running fine:
The problem should be the type of network traffic from the spsa test, but this could be at ISP level.
@ppigazzini At the moment only 15 workers are working and there is still something to do, so something IS definitely wrong.
Hi all,
with each new refresh I see different tests running. Sometimes, I see all of them, but sometimes only some of them. This alert is important.
@IIvec Same here.
@IIvec try stopping the "tune_nmb" test to see if this solves the problem. I have some free time only at the weekend, and I prefer not to touch anything on the production server until I have figured out the whole configuration. My plan is to write a new fishtest server installer for CentOS and to have some servers as test/backup.
@ppigazzini : OK, stopped. It seems that I have enough data from that test anyway.
@IIvec thank you. BTW I wrote "stop" but I meant "suspend". Next time I will choose the word more carefully.
Maybe these problems are a general problem atm. I also experience problems reaching other sites or streaming via Fire-TV ...
@joergoster this is an already known problem for tuning sessions with many parameters.
@ppigazzini : I guess the problem was the number of games (500000) and not the number of parameters (33).
@ppigazzini thank you for picking this up!
I don't have milestones to give you; you are mainly free to develop in areas where you see possible improvements. Anyhow, the critical job, apart from development, is maintenance, and in particular fixing bugs and addressing issues (luckily very few, although the code base has been mostly stalled for years).
In case you are looking for ideas, I'd suggest posting a specific topic on the forum, although, beware, you will receive many feature requests, not all of them valuable, and you have to be prepared to filter them out and to explain why you consider them not acceptable: this is, by far, the most difficult part of a maintainer's job :-)
@mcostalba : OK, message received. ATM @Stefano80 is reviewing the PRs on GitHub (I'm still not so confident w/ git); I'm in charge of the server administration tasks.
@ppigazzini not sure if this is a server admin related thing... I assume there are more details in some log somewhere...
I have been seeing (for a long time now) the following error message on (first) login to fishtest, e.g. to modify the state of a test:
Internal Server Error
The server encountered an unexpected internal server error
(generated by waitress)
Somehow I'm nevertheless logged in afterwards, so I can ignore it with a few extra clicks, but it would be nice to get rid of it.
In case someone wants to pick them up, here are a few ideas (some of which might have been mentioned before in this thread, I have not checked) I would like to implement or at least think through further but did not have the time yet:
I would also like to get rid of the error message @vondele mentioned which I also often encounter, but I have not investigated yet where this is coming from.
Honestly, I like it a lot the way it is, with a few minor changes: draw percentages, and a next/previous page button at the bottom. And I would like to see more opening book options, particularly various endgames. Maybe 30 books each focused on one kind of ending. Then we can improve our endings and judge endings from afar more accurately. Though an automatic 30 second refresh mode would be nice for following progress. And though completely eye candy, it would be nice to see a graph of how a patch did during testing...its ups and downs...how close it came to passing... Maybe distinguish more clearly between patches testing for Elo and those just trying to avoid regressing. Maybe a darker shade of green, red, yellow for the attempts at Elo gain vs the others.
I did make a bunch of ending books, but they haven't been checked for large advantages and such. Some I haven't even run. And they have repeated positions because I just used our 2-move book and removed pieces (the duplicate removers I tried required that there be games, not just positions). And there are 5 or 6 that someone else made so I could do some testing a couple years ago. There could be all kinds of problems with the books I made, I don't know. But maybe it is a starting point.
I have removed the attachment so as not to confuse anyone. The new endgame books are further down this thread.
Fishtest development has been practically stalled for 2 years.
We need someone well versed in Python and with a good amount of time and energy to dedicate to it. I really don't have enough time and I can just keep up SF maintenance; both Gary and Joona are very busy in their day jobs. OTOH fishtest needs improvements and needs a dedicated maintainer.