clingen-data-model / data-exchange-topics

For issue tracking and managing configurations of kafka topics in the dx

Request for 2 gene tracker production topics #2

Closed yugen closed 3 years ago

yugen commented 3 years ago

the unc_production group needs to produce to two new production topics:

I need these ASAP since I'm trying to produce events from GeneTracker's production stage now.

larrybabb commented 3 years ago

Can you provide a bit more detail on what content (message formats and scope of data) will be shared on each topic, and on any security requirements (could they be available to the public? to any ClinVar consumer? etc.)?

We are getting into an area that I think requires a little planning as it is unclear exactly what the DataExchange support team's responsibility is in terms of managing streams once they are established. For example, are we simply an operations group that takes all requests and responds asap? Are we responsible for cataloging and publishing documentation on the format and scope of data? Do we attempt to provide a common approach or direction on how topics are organized? ...

I'm truly not trying to get in the way. So we will set these two topics up. But we most definitely need to make a plan and set expectations on how topic management and support operates. Right now it is a bit of the wild west so we are concerned that we may need to go back and revisit the support process and oversight for the DataExchange.

larrybabb commented 3 years ago

@toneillbroad if you see a pattern for how @theferrit32 was setting up these topics and you feel confident you can reproduce his approach for the 2 requested topics above, then please let us know if this is possible by end of day.

My sense is that we should simply wait for @theferrit32 to return on Monday and handle it.

Let's have a sync-up with @tnavatar and @theferrit32 asap to get our hands around the resource demands and processes for supporting ClinGen's ecosystem around the DataExchange topics, documentation, support, and operations, so we can set clearer expectations with everyone.

larrybabb commented 3 years ago

@yugen Can this wait until Monday?

larrybabb commented 3 years ago

We can investigate whether this is an area where @sjahl can help us manage things more effectively.

yugen commented 3 years ago

@larrybabb,

Data structures for both topics have been extensively discussed with the expected consumers. I have and will continue to provide documentation about the structure and contents of the topics to which UNC produces. That doesn't seem like something you all should have to take on.

The schema for both feeds will be the same, but will have different event types for ease of consumption by the expected consumers. Here are the message schemas (json-schema format) and an example:

I'm actually already researching setting up a message broker for intra-UNC app communication. If topic creation/account management is becoming a burden for y'all, I'd be happy to look into setting up my own message broker for others in the consortium to consume from. It would also make sense in terms of keeping grant resources nicely separated and freeing Broad up to focus on more compelling work.

larrybabb commented 3 years ago

Thanks @yugen. @toneillbroad and I just heard from @tnavatar that he was fully aware of this and will be reaching out to you later today to get things going. Sorry for the confusion and communication issues on our side (I own that).

yugen commented 3 years ago

> @yugen Can this wait until Monday?

Yes.

theferrit32 commented 3 years ago

@yugen both topics are created and permissions assigned. You should be able to read/write to them now, and Stanford can read from gt-gci.

For gt-precuration-events, I am not sure what service account the website team is using. In staging we have a separate key in use by genegraph vs the website, but I don't see the same sort of account name created for the production website.