Logging seems like to require some configuration steps but despite of the configuration the logging interface logger.log(info/debug/error) should be the same.
But since the configuration of the logger requires file or link handling, it'd better be tested I guess...?
This mixture of filter and indexing can maybe refactored into more explicit way...?
But I also think it is already working conveniently already... hmm...
Maybe we can collect the comments into docstrings. Copilot is good at it : D
And the documentation in the readme might be sufficient but we probably need to clean that up too.
Here is a list of some sections that might be needed in the documentation.
Kafka Consumer/Message Handling
It seems quite straight forward but since it's the core part of it. Maybe we can document it better.
Shall we write some mermaid graph : D....??
Brainstorming what should be done for refactoring. Feel free to shoot down the ideas : D
Encapsulating partial code into modules/functions
For example, these can be split into another module...? https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/online_ingestor.py#L116-L124
Kafka Consumer/Message Handling
As it is carrying the core-logic, we maybe better wrap the confluent kafka interfaces so that we can easily test them as well...? For example, https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/online_ingestor.py#L71-L77 This kind of error-handling can be separated into a smaller function. Or Wrap this into a function or automate this somehow...? https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/online_ingestor.py#L32-L40
Logging
Logging seems like to require some configuration steps but despite of the configuration the logging interface
logger.log(info/debug/error)
should be the same. But since the configuration of the logger requires file or link handling, it'd better be tested I guess...?Ingestor
This guy definitely seems like to need some chopping. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L222
For example, I think this should be wrapped into a separate file handling interface so that we can test them. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L258-L261
Redefining Interfaces
Nexus structure path
https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L28-L33
This mixture of
filter
andindexing
can maybe refactored into more explicit way...? But I also think it is already working conveniently already... hmm...Ingestor
Maybe not redefining, but just better type-hinting or wrapping some of the arguments into a tighter data structure. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L222
Some of
if
statements can be merged into one if we letget_instrument
to acceptdefaultInstrument
likeget_nested_value_with_default
. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L384-L392Documentation / Docstrings
Maybe we can collect the comments into docstrings. Copilot is good at it : D And the documentation in the readme might be sufficient but we probably need to clean that up too. Here is a list of some sections that might be needed in the documentation.