kernelci / kcidb

kernelci.org common database tools
GNU General Public License v2.0

Intel 0day #331

Open spbnick opened 1 year ago

spbnick commented 1 year ago

Intel 0day is testing a lot of upstream kernels

Origin: 0dayci
Contacts: Philip Li @rli9

spbnick commented 1 year ago

@rli9, we're getting invalid submissions from 0dayci in the KCIDB playground: [image]

Could you make sure to call kcidb.io.SCHEMA.validate(data) on the data before passing it to kcidb.Client.submit()?

rli9 commented 1 year ago

Sorry for the invalid data. I will disable data submission for a while, as we currently lack the resources to identify why the bot generates invalid data. It is probably because our schema is not up to date.

spbnick commented 1 year ago

Thank you, @rli9! It's no problem if you send invalid data, it's not that much, and we just discard it. No need to stop the submissions.

The problem seems to be you specifying null for git_commit_name instead of omitting the field altogether. Please start calling kcidb.io.SCHEMA.validate() on your data before submission once you fix that.

Thank you!

spbnick commented 1 year ago

@rli9, we haven't received any data from you for the past three months, at least. Is this still because of the above issue, or is this something new? Could we help you somehow to bring the submissions back?

spbnick commented 7 months ago

@rli9, could we somehow help you bring your submissions back to KCIDB?

rli9 commented 7 months ago

Apologies for the late response, and thanks for checking again. The problem is that we still lack resources, and the implementation is internal and tightly integrated with the bot, so it's hard to get external support.

I'd like to ask whether it is OK for the bot to keep sending such invalid data (sorry for not being able to fix it in the near term). Also, does the 0day side need to catch up to any new protocol?

spbnick commented 7 months ago

Hi Philip, no problem, thank you for the response!

Yeah, I understand, we all have things to do. You can continue sending the data, but everything failing validation will be dropped and will not make it into the database, sorry. I can suggest a simple fix for this particular problem: go recursively through your JSON data before submission and remove all attributes with a null value. Maybe avoid going into the misc fields, if you have any data there that is supposed to contain nulls. It's also good to add the validation and, e.g., log any errors to help catch any new issues.
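A minimal sketch of that recursive cleanup, assuming the report is already a plain Python structure of dicts and lists. The `strip_nulls` name, the `skip_keys` parameter, and the sample report are illustrative and not part of KCIDB:

```python
def strip_nulls(value, skip_keys=("misc",)):
    """Return a copy of `value` without dict attributes whose value is None.

    Attributes named in `skip_keys` are dropped if they are None themselves,
    but their contents are copied through untouched, so free-form data under
    "misc" keeps any nulls it is supposed to have.
    """
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            if item is None:
                continue  # omit the attribute instead of sending null
            if key in skip_keys:
                cleaned[key] = item  # do not descend into free-form data
            else:
                cleaned[key] = strip_nulls(item, skip_keys)
        return cleaned
    if isinstance(value, list):
        return [strip_nulls(item, skip_keys) for item in value]
    return value


# Hypothetical report fragment; only git_commit_name comes from this thread.
report = {
    "checkouts": [
        {
            "id": "0dayci:example",
            "origin": "0dayci",
            "git_commit_name": None,
            "misc": {"note": None},
        }
    ]
}
cleaned_report = strip_nulls(report)
```

The function builds a new structure rather than modifying the input, so the original data stays available for logging if validation still fails afterwards.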

We have backwards compatibility with older schema versions and are able to upgrade data on the fly, so there's no requirement to upgrade. You can keep using the schema you have now. You would only need to upgrade if you want to use new features. You can see all our schema versions here: https://github.com/kernelci/kcidb-io/tree/main/kcidb_io/schema

You can compare the schema version Python files directly, or generate the particular JSON schemas with the kcidb-schema tool, and compare the final JSON, if you'd like to see the differences.
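One possible way to compare the final JSON, sketched with the Python standard library only. It assumes the two schemas have already been generated and loaded as dicts (for example with `json.load`); the `diff_json` helper and the tiny sample schemas are illustrative, not real KCIDB schemas:

```python
import difflib
import json


def diff_json(old, new, old_name="old", new_name="new"):
    """Return unified-diff lines between two JSON-compatible structures."""
    # Sorting keys and using a fixed indent makes the diff independent of
    # the key order in the original files.
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


# Made-up miniature schemas standing in for two generated schema versions.
old_schema = {"properties": {"a": {"type": "string"}}}
new_schema = {"properties": {"a": {"type": "string"}, "b": {"type": "number"}}}

for line in diff_json(old_schema, new_schema):
    print(line)
```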

rli9 commented 6 months ago

Thanks for the advice. I have restored the connection with KCIDB, and data is now being sent to both playground and production. Kindly check.

> I can suggest a simple fix for this particular problem: go recursively through your JSON data before submission and remove all attributes with null value.

Got it, I will add logic to scan the data before sending it out and to fix any issue that leads to null values.

spbnick commented 6 months ago

Wonderful! Thank you very much, Philip :heart: I actually saw your results coming before I saw your comment :grin: The results seem to look very good, although it would be great to have test logs :+1:

spbnick commented 4 months ago

Looking at the fresh logs, we've still got a couple of problems causing some of your messages to be dropped:

[image]

and:

[image]

Could you take a look, @rli9?

rli9 commented 4 months ago

Got it. For the 1st diagram, the test result is a performance result from kernel-selftests, like kernel-selftests.dma.dma_map_benchmark.avarage_unmap_latency: 0.3. How can I mark this as performance data?

For the 2nd issue, I will resolve it to avoid an empty git commit name.

nuclearcat commented 4 months ago

@spbnick, by the way, the topic of performance data will be relevant for KernelCI in the near future as well. @padovan has plans for this topic.

spbnick commented 4 months ago

> Got it. For the 1st diagram, the test result is a performance result from kernel-selftests, like kernel-selftests.dma.dma_map_benchmark.avarage_unmap_latency: 0.3. How can I mark this as performance data?

We don't have a schema formalised to support this right now, but would be happy to accommodate that, and your data would help! Right now, you can put any performance data into the misc field, as free-form JSON. Then we will be able to see and analyse your reporting needs, and account for them in a new version of the schema. Additionally, if your test doesn't have a clear PASS/FAIL, and only produces the performance data, set its status to DONE when it completes successfully, or to ERROR if it aborts due to e.g. a bug in its code.
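A rough sketch of what such a test object could look like. The top-level field names (`id`, `build_id`, `origin`, `path`, `status`, `misc`) are assumed from the KCIDB test object and should be checked against the schema version in use. The `performance` key inside `misc`, the `make_perf_test` helper, and the IDs are purely illustrative, since `misc` is free-form:

```python
def make_perf_test(test_id, build_id, origin, path, measurements, completed=True):
    """Build a test object for a test that only produces performance numbers.

    The status is DONE when the test ran to completion and ERROR when it
    aborted, e.g. due to a bug in the test code.
    """
    return {
        "id": test_id,
        "build_id": build_id,
        "origin": origin,
        "path": path,
        "status": "DONE" if completed else "ERROR",
        # Hypothetical layout: KCIDB does not define a structure for this yet.
        "misc": {"performance": dict(measurements)},
    }


perf_test = make_perf_test(
    test_id="0dayci:example-test",    # made-up ID
    build_id="0dayci:example-build",  # made-up ID
    origin="0dayci",
    path="kernel-selftests.dma.dma_map_benchmark",
    measurements={"avarage_unmap_latency": 0.3},  # name as spelled in the report
)
```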

> For the 2nd issue, I will resolve it to avoid an empty git commit name.

Thank you, Philip!

spbnick commented 4 months ago

> @spbnick btw topic on performance data is valid in near future for KernelCI as well. @padovan have in plans this topic.

Great! Then you better start sending this data ASAP 😁👍🏻

spbnick commented 4 months ago

Oh, and another thing, @rli9: if you call kcidb.io.SCHEMA.validate(data) on your data before submitting it, you would get an exception if it's invalid, and would be able to deal with it, instead of us having to drop it.
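A minimal sketch of that flow, written so it runs without KCIDB installed: the validator and the submit function are passed in as callables. The `submit_if_valid` helper and the logger name are illustrative, and catching the broad `Exception` is a simplification because the exact exception class is not named in this thread:

```python
import logging

logger = logging.getLogger("kcidb_submit")  # hypothetical logger name


def submit_if_valid(data, validate, submit):
    """Validate `data`, then submit it; log and skip it if validation fails.

    `validate` must raise an exception on invalid data and `submit` sends
    the data. Returns True if the data was submitted, False otherwise.
    """
    try:
        validate(data)
    except Exception as error:  # narrow this to the validator's own exception
        logger.error("Not submitting invalid report: %s", error)
        return False
    submit(data)
    return True
```

With KCIDB, the callables would presumably be `kcidb.io.SCHEMA.validate` and the `submit` method of a `kcidb.Client` instance, as named earlier in this thread. Logging the error on the sender's side makes the failing field visible before the data is dropped on the receiving end.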