AntonioMrtz commented 3 months ago

Description

The following is wanted for improving backend logging:

Improve logging for events, operations, requests and so.
Maintaining consistency between all logs is also wanted.
Search for missing logs that can be added and can be useful.

The main goal for logging in the project was to detect app behaviour and errors without getting even into the code. The logs should be concise, clear and representative enough to achieve that goal. Implement this issue with that main goal in mind, as it was done before on backend refactor branch.

Context

The branch for refactoring backend was long and took several months, this caused logs to be inconsistent and sometimes incorrect.

How to do it

Take a look at backend and search for improvements & errors associated with logging

Testing

AntonioMrtz commented 3 months ago

Improve Backend Logging

aarshgupta24 commented 3 months ago

hello, i would like to assist in this issue though I'm new to open source stuff but I know the basics of logging

AntonioMrtz commented 3 months ago

Hi @aarshgupta24. All yours, let me know if you need anything.

AntonioMrtz commented 3 months ago

Hi @aarshgupta24, I made some backend changes during the wekeed. They should impact much on your task but consider keeping your branch up to date 😃

aarshgupta24 commented 3 months ago

got it @AntonioMrtz

aarshgupta24 commented 3 months ago

Hi @AntonioMrtz , I have a doubt like there in no logging constant defined for login folder in spotify_electron , so should I initialize one or there is no need for that folder to have a logger? And generally also should i focus mainly on finding where can i add log messages or also like where logger is not defined? Also should I also add info logs( i'm trying not to as they increase the verbosity) in majority of place?

AntonioMrtz commented 3 months ago

Hi @aarshgupta24 , I dont know if I understood correctly but i will try to answer:

Logging constants are associated to concrete loggers ( Theyre some kind of ID for the logger). The purpose is showing which logger printed the statement. Example:
```
genre_service_logger = SpotifyElectronLogger(LOGGING_GENRE_SERVICE).getLogger()
```
By establishing LOGGING_GENRE_SERVICE as the name of the logger the prints that the logger executes wil show the name of the logger. This helps to identify the context and location of the log.

As you see logging constants are not associated with packages/folders but with modules/files/classes.

The main goal of logging in the app is to trace every process happening. By only looking at logs we should be able to know what happen at everytime in the app, including debugging/locating unexpected errors/app failures. This concept is a bit broad but by focusing on it I think you can solve most of the questions you have. Apart from it, I will be looking for the following:
Look for inconsistencies. Example: one get method from a repository prints x with y format and its not similar across other repository get methods.
Look for missing logs that can be useful to have.
Look for logs that could be better at a different log level. Example: debug->info

Keep in mind that log level (debug,info...) matters. We dont want the app on info mode to be full of logs but they should be enough to trace whats happening. In debug mode you can have more logs that shows attributes or other meaninful data. Example:

def get_all_playlists() -> list[PlaylistDAO]:
    """Get all playlist

    Raises
    ------
        PlaylistRepositoryException: an error occurred while\
              getting all playlists from database

    Returns
    -------
        list[PlaylistDAO]: the list whith all the playlists

    """
    try:
        collection = get_playlist_collection()
        playlists_files = collection.find()
        playlists = [
            get_playlist_dao_from_document(playlist_file) for playlist_file in playlists_files
        ]
    except Exception as exception:
        playlist_repository_logger.exception("Error getting all Playlists from database")
        raise PlaylistRepositoryException from exception
    else:
        playlist_repository_logger.info(f"All playlists obtained : {playlists}") -> this prints playlist with all the attributes, maybe we can print the names of the obtained playlist on info, and the playlist data on debug
        return playlists

Im not close on how logging should be done and what content should it have so I think the best will be to make the improvements/changes on one entity of the app such as playlist. Then I can review the changes and we can figure it out what we like/dont like and establish some kind of standard to apply on the rest of the modules. Making a proposal for Playlist entity (repo,schema,controller,service) sounds interesting as its the base/default entity of the project and contains common and specific logic.

About your last question, I would try to keep a similar structure, I dont think theres a need to add a ton of new logs but maybe Im wrong. By reviewing the proposed changes on playlist logging we can get closer and closer to what we actually want.

Sorry about the size of the answer 🤣 but I think it can be useful.

aarshgupta24 commented 3 months ago

thank you @AntonioMrtz , I was a little nervous as this is my time contributing and I was also thinking of asking you to review my work so that I don't mess the the things. I will create a proposal and then we can discuss the structure further.

aarshgupta24 commented 3 months ago

Hi @AntonioMrtz , I am attaching the proposal in this comment. Untitled document.docx

AntonioMrtz commented 3 months ago

Hi @aarshgupta24, thanks for your time and your proposed ideas.I took a quick look and It looks good. I will check it out more closely and give you feedback in 1-2 days max. 🙂

AntonioMrtz commented 3 months ago

Hi, I analized the proposal and now I can defined more precisely what were searching for and how it should be done. Lets keep this an interative process until we find a definitive solution.

Basics

Do you have Ruff linter active? It has exception managment and logging rules that were following and are the base of the application development. Its helpful to have it on so we can follow the same baselines.
I tried to define the task but I think it wont be a bad idea to split it in smaller tasks because the scope has grown too wide.
Logging the exception in logger.exception is redudant. It already prints the exception trace along with the message. Check this for more info.

Logging Proposal

Common logging templates

Mainting consistency and having no code repetition is my biggest concern because it allows the code to be maintaned way easily. I think by using common string templates we can overcome the problem.
Common logs should be place into template strings and for logging only the parameters to fill the string are needed.
Candidates for being refactor into a template should be the ones like (but basicly most of the logging strings could be refactored):
```
Unexpected error in Song Repository deleting song : {name}" // Replace 
"Unexpected error in Song Service deleting song : {name}"
```

f"Song not found : {name} -> Replace Song and name for placeholders Bad Song Name Parameter : {name} -> Replace Song and name for placeholders "User {artist_name} not found -> Replace User and artist_name for placeholders

Something like this could be good. Templates should be stored in another module.

template = "{entity_name}not found: {item_name}" formatted_string = template.format(entity_name="Song",item_name="song_name")


## Controller

- I dont see a benefit of logging exceptions here. Depending on the raised exception in the service layer a custom response with the associated error will be sent to the client. Controller should be (imo) left without business logic. Keeping the logic inside the controller small will lead us to use the controller only for calling service layer and answering the client via HTTP.
- As said previously logging in the controller while were sending error status codes and messages with some info feel redundant.

## Repository

- Should be in debug mode because theyre logging to much information in a frequent rate.
- Maybe we should be using an info statement that prints if the operation was succesful without printing to much data and another debug statement for logging the whole object.

def get_playlist( name: str, ) -> PlaylistDAO: try: collection = get_playlist_collection() playlist = collection.find_one({"name": name}) handle_playlist_exists(playlist) playlist_dao = get_playlist_dao_from_document(playlist) # type: ignore

except PlaylistNotFoundException as exception:
    raise PlaylistNotFoundException from exception

except Exception as exception:
    playlist_repository_logger.exception(f"Error getting Playlist {name} from database")
    raise PlaylistRepositoryException from exception
else:
    playlist_repository_logger.info(f"Get Playlist {name} successful")
    playlist_repository_logger.debug(f"Get Playlist by name returned {playlist_dao}") -> DEBUG PRINT whole object
    return playlist_dao



## Service

- I would just check the messages and info that are being printed. The schema should remain the same but only changing the messages
- Using the info/debug schema shown in repository could also be helpful

## Validations checks

- I dont think logs are needed here because theyre already handled by the service layers. We should unify the comments/code so its reusable and consistent.

## Schema

- I like it but it should be in debug mode. Convertion from DAO to DTO and from Document to DAO occurs to frequent and the log size will be too big (logging attributes).

## More descriptive exceptions

As shown [here](https://docs.astral.sh/ruff/rules/raise-vanilla-args/) its not recommended to throw exceptions with a message. This can lead to the same exception being raised with different messages across the app, losing traceability. I would like to know what raising exceptions with custom messages gives us if were already logging messages associated to the raised exceptions. Im not close on introducing it but Im not really sure it will have an advantage.

## General guidelines

- NEVER log passwords, even if theyre hashed. We have to be careful with logging `Users`, `Artist` or other children classes from `User` before actually converting the object from DAO to DTO. (In DTO were not storing the password).

## New tasks (Out of scope for this issue)

- Validations methods should be consistent. Rigth now some of them are called handle and others validate. I will stick with `validate_*` for this kind of operation.
- Add `logger.warning` to events that can lead to future problems. [Providers](https://github.com/AntonioMrtz/SpotifyElectron/blob/master/Backend/app/spotify_electron/song/providers/song_service_provider.py) or mappers are a clear example, warn if theres no mapping available and the default is being loaded.
- We should add some kind of correlation ID for operations that are related and ocurred one after another in a single thread. This can be useful when handling multiple request at the same time.

Again, thanks for your time. Im open to talk about any question/suggestion but I think we can use this schema for a first approach/iteration of the logging issue were facing. The next phase can be putting everything we have got into a conclussion in this thread in the actual `playlist entity` in the codebase. Then we can test and keep what we like/ discard what we dont. If you have any questions/suggestions just tell me, I will be happy to answer.

aarshgupta24 commented 3 months ago

hi @AntonioMrtz , thank you for this now i will try to keep all this in my mind, i will first implement the schema part just i one thing that after the logging message should raise exception(without the message) or not ?

AntonioMrtz commented 3 months ago

hi @aarshgupta24 . Yeah I think we should. The exception will be catch by the big except Exception on repository and service giving us more info about what happen. I thinks this is a good way because the exception handling will be the same for a 500 error and trying to converto document-dao-dto. In exception blocks make sure to use (ConvertionException,Exception), is redundant but is easier to see whats getting catch. Log convertion errors in exception mode (the debug mode I told you was for sucessful operation but I dont thinks its necessary) and then raise the convertion error.

AntonioMrtz commented 3 months ago

Hi @aarshgupta24 , the issue #164 and #165 are related to this task. Maybe you want to tackle them first. This tasks can be made before this issue because theyre a basicly subtask.

aarshgupta24 commented 3 months ago

hi @AntonioMrtz , i will go through both of them and let you know and revert back you on 14th as i have a interview scheduled on 13th

AntonioMrtz commented 3 months ago

@aarshgupta24 Yeah no problem. Lets keep in touch. 😃

aarshgupta24 commented 3 months ago

Hi @AntonioMrtz , I think i will first handle the issue #164.

aarshgupta24 commented 3 months ago

if im correct in issue #164 , the file playlist_repositry_validation.py we have used handle and in playlist_service_validation.py we have used validate, so you want me to change handle to validate and also rather than an if statement replace it with try-except block?

AntonioMrtz commented 3 months ago

Hi @AntonioMrtz , I think i will first handle the issue #164.

Hi, I think you have to comment in #164 so I can add you as assignee.

Rename all validations to the same name pattern validate_x. I dont know how a try/catch will fit the code, were checking with if statements and then throwing exceptions to the higher levels. Maybe I didnt understand it correctly. Correct me if Im wrong. As far as I know structure has to be:

if not validation_condition:
    return CustomException

AntonioMrtz commented 1 month ago

Hi @aarshgupta24 are you still working on this issue?

aarshgupta24 commented 1 month ago

hi @AntonioMrtz , i'm really sorry my college opened up and since then i'm really having a very hard time figuring out time for this. I'm sorry but i don't think so i will be able to complete this issue

AntonioMrtz commented 1 month ago

Hi @aarshgupta24 thanks for all your work in this project. Feel free to come back and contribute anytime you want 😃

coyster commented 1 month ago

Hey @AntonioMrtz would you mind if I take a crack at this?

Proposals:

Add %(funcName)s and %(module)s to the debug, error, and critical format in SpotifyElectronFormatter -> FORMATS, makes the logs much easier to read
Move the exception logs from the function to the exception class. This would clean up the code a bit and keep exception logging centralized. Since exceptions come with the stack trace it'll still show where it was raised
- Could put in an arg into the exception to capture info

I can also start working on the other mentioned issues:

Adding logging template function
Refactor f-strings into templates
Add / refactor logging in the playlist files to start

AntonioMrtz commented 4 weeks ago

Hi @coyster , nice to meet you. Yeah, thanks for taking the issue. Let me know if you need anything while working on this issue :). We also have the project docs that gathers all the info, methodology and common questions. Feel free to check it out if you wanna know about the project and how we colaborate.

🎯 Objectives

What we want to achieve:

Reduce quantity of code
Code reusal and centralization
Consistency and clarity
Traceability
Readable and easy to trace errors

Optional(can be done in this issue or not)

Implement uuid tracing for the whole lifecycle of an action. Example:

Frontend: get playlist -> Backend: controller getting request -> service -> repo -> service -> controller -> Frontend: recieve feedback

It will be nice to have a custom header for frontend requests that has an uuid for tracing all the steps related to an operation. Backend would take the uuid and use it to log the steps made. If this is too much we can leave it for another issue, I would take part of it for the front side if you're OK with that.

✨ Current Situation

I think it's important to know how the app behaves right now so we can tackle the issue more accurately. Let's take trying to get a non existent playlist as an example.

INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
2024-09-20 08:41:06,102 -  AUTH_SERVICE - INFO - Token data: TokenData(username='user-test', role='user', token_type='bearer')
2024-09-20 08:41:06,106 -  PLAYLIST_SERVICE - ERROR - Playlist not found: playlist-test
Traceback (most recent call last):
  File "/workspaces/SpotifyElectron/Backend/app/spotify_electron/playlist/playlist_repository.py", line 83, in get_playlist
    validate_playlist_exists(playlist)
  File "/workspaces/SpotifyElectron/Backend/app/spotify_electron/playlist/validations/playlist_repository_validations.py", line 29, in validate_playlist_exists
    raise PlaylistNotFoundException
app.spotify_electron.playlist.playlist_schema.PlaylistNotFoundException: Playlist not found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/SpotifyElectron/Backend/app/spotify_electron/playlist/playlist_service.py", line 70, in get_playlist
    playlist = playlist_repository.get_playlist(name)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/SpotifyElectron/Backend/app/spotify_electron/playlist/playlist_repository.py", line 87, in get_playlist
    raise PlaylistNotFoundException from exception
app.spotify_electron.playlist.playlist_schema.PlaylistNotFoundException: Playlist not found
INFO:     127.0.0.1:50336 - "GET /playlists/playlist-test HTTP/1.1" 404 Not Found

(1) As you can see we have traces about the all functions and modules that takes part on the operation and handles the incoming request. I'm not opposed to the idea of moving into templates either.

(2) Having logs into the exceptions was something I thought about when backend refactor occured. It can be a great idea. Code reusability and consistency could be achieved by following your second point, I'm definitely in favour of that.

🔨 How to proceed

I think we can test a mix of your ideas and my feedback in a little prototype and iterate over that. We could start by making a code proposal for only getting a playlist that doesn't exists, like in the example above. Once we have a clearly defined line of work we can use this operation for iterating purposes

coyster commented 3 weeks ago

@AntonioMrtz Code change / sow is attached Code Change SOW.md

See Question for pydantic model for incoming header and I'm leaning toward option 1 in 'Add UUID to log' section

Let me know what you think, and I'll start working

AntonioMrtz commented 3 weeks ago

Frontend-Backend correlation id

AntonioMrtz commented 3 weeks ago

Hi @coyster , I wrote a response one day ago but it didnt get sent jajajaja. I will answer you again asap

@AntonioMrtz Code change / sow is attached Code Change SOW.md

See Question for pydantic model for incoming header and I'm leaning toward option 1 in 'Add UUID to log' section

Let me know what you think, and I'll start working

AntonioMrtz commented 3 weeks ago

Hi again @coyster, first of all thanks for your time.

(3) UUID

I'm using this middleware at my current job https://github.com/snok/asgi-correlation-id. I think it can get the job done and we only will have to modify the format of the logging messages at base SpotifyLogger. Prop drilling request info from the endpoints doesn't sound good at all.Tell me if the middleware sounds good to you.

(1) (2) (4) Exception logging

I like the extra args method for giving context to the exception. If we combine that with using pre-defined messages we can achieve readability and reusability

try:
    ...
except Exception as exception:
    raise PlaylistRepositoryException(
        message_key=message_storage.CREATION_ERROR, # "Error creating playlist"
        playlist_name="My Cool Playlist",  -> context for item that failed
    ) from exception

In the exception class we will have to handle the extra args
We can reuse messages all over the app
Move logging into exception classes

class CustomException(Base:
     log("Repository exception") + extra args given

Tell me if you have any questions, I wrote it from memory since so it may not be as exact as the message that didn't get sent :)

Maybe we can start the PR by doing this refactor to playlist related and global stuff. This way we can iterate over the code until we're certain about how we exactly want to organize everything

AntonioMrtz commented 2 weeks ago

Hi again @coyster, another requirement for the task will be selecting the correct log level across layers such as repository and service. Having a consistent log level is key, I notice there're some repository methods that log on info on success while other do it on debug. Others directly don't log anything (they should).

AntonioMrtz / SpotifyElectron

Improve Backend Logging #160

Description

Context

How to do it

Testing

Basics

Logging Proposal

Common logging templates

🎯 Objectives

Optional(can be done in this issue or not)

✨ Current Situation

🔨 How to proceed

(3) UUID

(1) (2) (4) Exception logging