aschwanden1 opened this issue 2 years ago
Thanks. I have some questions, and maybe others should weigh in on this.
There are a few test cases in the SDK, but many in PSI/J. Is this meant for both or only for the SDK?
Would it make sense to separate site curation from the way things are displayed? Like have one issue/PR for curating sites and one for the table display described above?
I don't think we want to curate sites in PSI/J, in the sense that the purpose of the testing dashboard there is to show the status of tests on all sites that run them and choose to contribute results.
Similarly for the test cases: I would not want to hide test cases, since that would seem to defeat the purpose of a site that shows how the tests are doing.
Would it also maybe make sense to have a separate issue for how to support multiple deployments?
I think one way to clarify this is to consider the audience for the new tab. It is geared towards potential adopters of PSI/J, to allow them to see if it is running on the systems that are important to them. Also, by showing many systems, we can convey that PSI/J is a good thing to adopt due to its portability and support.
Developers of PSI/J will find the tab useful, but more likely they would go to the existing testing tab, which provides much more information.
Regarding SDK vs. PSI/J. I think this tab will be useful for both aspects. For PSI/J, we may want to just say yes/no if it is working, or perhaps we could report a few tests that represent common use cases of PSI/J when it is used by itself (that is, not by an existing workflow management system, but as a replacement for a user batch script or a python+batchscript workflow, where python generates many batch scripts and submits them). We also discussed showing on the PSI/J dashboard whether any integration tests with other workflow tools are working (e.g. PSI/J + Parsl, PSI/J + RADICAL, ...). This might also showcase PSI/J's utility and popularity.
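For illustration, here is a minimal sketch of what one such standalone use-case test could look like (this assumes the psij-python API; the 'slurm' executor name and the job count are placeholders, not a statement about what the actual tests do):

```python
# Hypothetical "PSI/J by itself" smoke test: replace a hand-written loop of
# batch scripts with PSI/J job submissions. The executor name is a
# placeholder; on a real machine it would match that site's scheduler.
from psij import Job, JobSpec, JobExecutor

executor = JobExecutor.get_instance('slurm')  # e.g. 'local', 'slurm', 'lsf'

jobs = []
for i in range(10):
    job = Job(JobSpec(executable='/bin/echo', arguments=[f'task {i}']))
    executor.submit(job)
    jobs.append(job)

# The new tab would only need the aggregate yes/no outcome of something like this.
for job in jobs:
    job.wait()
```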
For the SDK, we probably want to report that each tool is working, and maybe point to some useful passing examples (maybe we take examples from docs/tutorials and make sure they run?).
My preference is that we treat this tab as a prototyping activity and discuss after the Friday meeting.
It is geared towards potential adopters of PSI/J, to allow them to see if it is running on the systems that are important to them. Also, by showing many systems, we can convey that PSI/J is a good thing to adopt due to its portability and support.
Sounds good. Sorry to be pedantic, but I'm unclear about some things:
Regarding SDK vs. PSI/J. I think this tab will be useful for both aspects. For PSI/J, we may want to just say yes/no if it is working, or perhaps we could report a few...
I was thinking that it could be perfectly reasonable to configure which views should be accessible for each deployment rather than necessarily trying to make sense of a view designed for a different deployment.
Good questions: initially we are just going to try to showcase the systems where we are standing up CI, their early access machines, and hopefully a few other machines at their sites. So for LLNL, I'm hoping LC will let us post rznevada or similar (EAS for El Cap), and maybe a few of our large Linux clusters on the unclassified compute environment. Without seeing the dashboard I'm not sure how much info to show. I think we want to convey in real time that PSI/J is working properly on resources 'important' to DOE users, with some iteration probably needed to figure out the most impressive or impactful machines to put up on the front page.
I think we do want to have a config for PSI/J separate from SDK, but have common code-base. Is that what you are getting at?
One more comment: I'm not sure how much info we can pack into the tab, but if we could show that PSI/J is running on effectively 'all' compute resources at a site, that would be very cool if the lists don't get too long. Livermore has dozens of machines, so I'm not sure how useful that would be to someone not at Livermore, but it might be useful to convey that Livermore has adopted PSI/J, or that PSI/J is supported across the data center. The trade-off between having lots of machines vs. a few important ones seems like a good thing to discuss, and maybe we should look at examples of both to see if visually there are things which lead us to one or the other approach.
It sounds to me that what we want here is not necessarily a view on the dashboard site, but components that we can integrate on the main site and the like. I feel that there is an unstated assumption in this discussion that presenting the testing data can only reasonably be done on the testing dashboard. But the dashboard backend serves all the necessary data as JSON over HTTP, which gets transformed into views on the client side. So this is entirely possible and would allow us to separate PR from developer tools so that both can be tailored to maximize their respective utilities.
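As a rough sketch of that idea (the endpoint URL and field names below are assumptions, not the dashboard's actual schema), a small component on the main site could pull the JSON and reduce it to a per-machine pass/fail summary:

```python
# Hypothetical consumer of the dashboard's JSON API, for embedding a status
# summary outside the dashboard itself. URL and field names are assumptions.
import json
import urllib.request

DASHBOARD_URL = 'https://testing.example.org/api/results'  # placeholder

def fetch_results(url: str = DASHBOARD_URL) -> list:
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())

def summarize(results: list) -> dict:
    """Reduce raw test records to a {(site, machine): 'pass'|'fail'} map."""
    summary = {}
    for r in results:
        key = (r.get('site', '?'), r.get('machine', '?'))
        ok = r.get('passed', False)
        # A machine shows as failing if any of its reported tests failed.
        summary[key] = 'fail' if (summary.get(key) == 'fail' or not ok) else 'pass'
    return summary

if __name__ == '__main__':
    for (site, machine), status in sorted(summarize(fetch_results()).items()):
        print(f'{site:12} {machine:16} {status}')
```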
I think we do want to have a config for PSI/J separate from SDK, but have common code-base. Is that what you are getting at?
I don't think I had a conscious agenda. I'm trying to understand the problem and part of that is understanding what the constraints are. And part of that is trying to understand if it's reasonable to have a single solution that is both good for engineering and PR. In a sense, I think so, and that's because I believe that our target audience is engineers and scientists, for which the best PR might be good engineering. But that's not generally the case, so maybe it's fine to simply present data differently in different contexts and actually have those presentations/contexts carefully designed separately while using as much common infrastructure as possible.
This is my thinking: we are getting back lots of info from tests, but only a subset (pass/fail, name/kind of test, machine name, scheduler, compute center) is useful to end users and developers of PSI/J; the rest is really only useful to people who maintain and port PSI/J (all the logs and details and such). So it seems that the same back end and a well-structured front end should be able to share a bunch of code.
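Just to make that split concrete, a tiny sketch with made-up field names:

```python
# Hypothetical split of a full test record into the user-facing subset and
# the maintainer-only details. All field names here are illustrative.
PUBLIC_FIELDS = ('passed', 'test_name', 'test_kind', 'machine', 'scheduler', 'site')

def public_view(record: dict) -> dict:
    """The subset shown to end users and adopters on the new tab."""
    return {k: record[k] for k in PUBLIC_FIELDS if k in record}

def maintainer_view(record: dict) -> dict:
    """Everything else (logs, tracebacks, environment dumps, ...)."""
    return {k: v for k, v in record.items() if k not in PUBLIC_FIELDS}
```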
According to the team discussion: we need a view that will show machines on the Y axis and test cases on the X axis. The Y axis needs an accordion to open and close the machines for each lab site. The X axis should be test cases such as PSI/J, PSI/J + Swift, PSI/J + RADICAL.
The X axis and Y axis will intersect as a table with rows and columns.
There will be a configuration file that lists which machines are to appear in the leftmost column. The config file will also list the names of the test cases and contain indexes that tell the code where to retrieve the values from.
Furthermore, the config file will allow setting text strings such as the title, intro blurbs, subtitles, etc., for various deployments.
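A hypothetical sketch of what such a config could contain (written here as a Python dict for brevity; the real file could just as well be JSON or YAML, and every key and value below is a placeholder):

```python
# Illustrative per-deployment view config: machines grouped by lab site for
# the accordion (rows), test cases (columns), and display text.
VIEW_CONFIG = {
    'deployment': 'psij',            # e.g. 'psij' or 'sdk'
    'title': 'PSI/J status on DOE machines',
    'intro': 'Live results of PSI/J tests on contributing systems.',
    'sites': {                       # Y axis: one accordion group per lab site
        'LLNL': ['rznevada', 'some-linux-cluster'],
        'AnotherLab': ['another-machine'],
    },
    'test_cases': [                  # X axis: one column per test case
        # 'index' tells the code where in the result data to find the value
        {'label': 'PSI/J',           'index': 'core'},
        {'label': 'PSI/J + Swift',   'index': 'swift'},
        {'label': 'PSI/J + RADICAL', 'index': 'radical'},
    ],
}

# A table view would then iterate machines (rows) by test cases (columns),
# looking up each cell's pass/fail via the test case's 'index' for that machine.
columns = [tc['label'] for tc in VIEW_CONFIG['test_cases']]
for site, machines in VIEW_CONFIG['sites'].items():
    for machine in machines:
        print(site, machine, columns)
```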