n8n-io / n8n

IMAP node occasionally times out and disables workflow #3607

Closed mark-monteiro closed 1 year ago

mark-monteiro commented 2 years ago

Describe the bug

I have a workflow that is triggered by the IMAP node. It works correctly 90% of the time, but at least once per day the node fails while checking for new emails with the following error:

n8n    | 2022-06-27T13:14:22.681Z | error    | Email Read Imap node encountered a connection error {"error":{"errno":-110,"code":"ETIMEDOUT","syscall":"read","source":"socket"},"file":"EmailReadImap.node.js"}

This causes the workflow to be deactivated, and I have to manually re-activate it before it starts working again. If I open the workflow in the web UI, this is the error displayed:

Problem activating workflow The following error occurred on workflow activation: read ETIMEDOUT

To Reproduce

I'm not sure how easy or difficult this will be to reproduce. The IMAP node is pointed at an inbox on a Google Workspace account. After activating the workflow, it eventually stops working with the ETIMEDOUT error, usually in less than 24 hours.

The node from my workflow is included below:

{
  "nodes": [
    {
      "parameters": {
        "format": "resolved",
        "options": {
          "customEmailConfig": "[[\"UNSEEN\"], [\"FROM\", \"support@sosinventory.com\"], [\"TO\", \"sos-exports@sentisolutions.ca\"]]",
          "allowUnauthorizedCerts": true,
          "forceReconnect": 60
        }
      },
      "name": "IMAP Email - SOS Item Export",
      "type": "n8n-nodes-base.emailReadImap",
      "typeVersion": 1,
      "position": [
        740,
        680
      ],
      "retryOnFail": true,
      "notesInFlow": false,
      "waitBetweenTries": 5000,
      "credentials": {
        "imap": {
          "id": "1",
          "name": "Senti IMAP account"
        }
      }
    }
  ],
  "connections": {}
}

Expected behavior

I'm not sure what the correct fix for this is. In general, I think an intermittent connection timeout should not disable the entire workflow. If this is not already done, I think the node's "Retry on fail" settings should be respected so it can retry the connection. I didn't see any messages in the log file about the connection being retried, so I assume that is not happening. In that scenario the "wait between tries" setting also seems very limited with a maximum of 5 seconds; it would be nice if it could be longer.
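
To make the request concrete, here is a minimal sketch of the behaviour described above: treat `ETIMEDOUT` as a transient error and retry the connection with a wait that can grow beyond 5 seconds. This is not n8n's implementation. `isTransient`, `backoffDelayMs`, and `connectWithRetry` are hypothetical names, `connect` stands in for whatever opens the IMAP connection, and treating `ECONNRESET` as transient is an assumption.

```typescript
// Sketch only: these helpers are hypothetical and not part of n8n.

// Error codes treated as transient. ETIMEDOUT is the code from the log line
// above; ECONNRESET is an assumed addition.
const TRANSIENT_CODES = ["ETIMEDOUT", "ECONNRESET"];

/** True when the error carries one of the transient socket error codes. */
function isTransient(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && TRANSIENT_CODES.indexOf(code) !== -1;
}

/** Wait before retry number `attempt` (1-based): doubles each time, capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(baseMs * Math.pow(2, attempt - 1), maxMs);
}

/**
 * Calls `connect` up to `maxTries` times. Only transient errors are retried;
 * anything else, or the final failure, is rethrown to the caller.
 */
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  maxTries: number,
  baseMs: number,
  maxMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await connect();
    } catch (err) {
      // Give up on non-transient errors or once the retry budget is spent.
      if (!isTransient(err) || attempt >= maxTries) throw err;
      await sleep(backoffDelayMs(attempt, baseMs, maxMs));
    }
  }
}
```

With `baseMs = 5000` and `maxMs = 60000`, the waits would be 5 s, 10 s, 20 s, 40 s, and then 60 s for every later attempt.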

Environment (please complete the following information):

Additional context N/A

Joffcom commented 2 years ago

Hey @mark-monteiro,

I have just set up a test workflow that I will leave running to see if I can reproduce this issue.

Joffcom commented 2 years ago

Hey @mark-monteiro,

Quick update for you: it has not failed yet. I will continue to keep an eye on it.

mark-monteiro commented 2 years ago

Quick update on this:

Joffcom commented 2 years ago

Hey @mark-monteiro,

It is nice that you have a workaround for it. I am still waiting for mine to error out.

rottmann commented 2 years ago

I can confirm the IMAP issue. I don't know which version introduced the problem, but I used my script for a year without any problems. It has been occurring for the last few months, since I updated n8n.

From time to time (currently about once per week) the workflow stops working and has to be reactivated manually.

The other problem is that the system does not notify anyone about a stopped workflow (e.g. by mail).

Could workflows that were stopped by the system be restarted automatically after some time? A configurable timeout setting for this would be nice.

Joffcom commented 2 years ago

@rottmann That is interesting. I am still waiting for mine to stop, and it has been over a month so far. I think the node failure is environmental, which always makes it trickier to sort out.

Having the node disable if it can't make a connection may be more a safety feature than anything else to prevent it from causing a failure loop. Maybe a quick way to get notifications on the status would be to use an error workflow that should let you know if a workflow has failed.

Another possible solution could be to enable the Retry on Fail option for the node and set it to something like 5 max tries with a 5 second delay or tell it to continue on fail and do an If node that checks if there is any response before moving on.

Out of interest, what mail provider are you using? I have 3 triggers currently running: one using Gmail, one on Google Workspace (on the off chance there is a difference), and one on a GMX mailbox.

rottmann commented 2 years ago

@Joffcom We have our own mail server. The server sends and receives test mails every 5 minutes without problems. n8n waits for new mails and processes them. It reconnects every 5 minutes.

In the n8n log file we had many entries last night at 23:45 with "Initializing active workflow ..." and one "Unable to initialize workflow ...", without any further data.

I checked the mail server log for that time, and it seems that the n8n workflow still reconnects every 5 minutes. This should not be possible, because n8n stopped the workflow.

Is it possible that the old n8n workflow was shut down incorrectly and then could not restart?

mark-monteiro commented 2 years ago

Another possible solution could be to enable the Retry on Fail option for the node and set it to something like 5 max tries with a 5 second delay or tell it to continue on fail and do an If node that checks if there is any response before moving on.

@Joffcom I already tried this and the retry logic does not do anything. I do not think the retry logic is applied to the part of the code where the IMAP connection is established.

@rottmann Included below is the workflow I am currently using to work around this issue. It deactivates and then re-activates the workflow whenever a trigger error like this one occurs (NOTE: you'll need to set your own URL and n8n API key for the two HTTP request nodes). It also sends an error notification to Google Chat, which you can remove or replace with your preferred notification.

{
  "nodes": [
    {
      "parameters": {},
      "name": "Error Trigger",
      "type": "n8n-nodes-base.errorTrigger",
      "typeVersion": 1,
      "position": [
        -460,
        880
      ]
    },
    {
      "parameters": {
        "spaceId": "spaces/AAAATTakZLA",
        "messageUi": {
          "text": "=<users/all>: {{$json[\"errorMessage\"]}}"
        },
        "additionalFields": {}
      },
      "name": "Google Chat Error Message",
      "type": "n8n-nodes-base.googleChat",
      "typeVersion": 1,
      "position": [
        440,
        880
      ],
      "credentials": {
        "googleApi": {
          "id": "10",
          "name": "Google Chat Service Account"
        }
      }
    },
    {
      "parameters": {
        "requestMethod": "POST",
        "url": "http://localhost:5675/api/v1/workflows/10/activate",
        "options": {},
        "headerParametersUi": {
          "parameter": [
            {
              "name": "X-N8N-API-KEY",
              "value": "xxxxxxxxxxxxxxxxxx"
            }
          ]
        }
      },
      "name": "Re-Activate Workflow",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 2,
      "position": [
        640,
        600
      ],
      "notesInFlow": true,
      "notes": "Workaround for https://github.com/n8n-io/n8n/issues/3607"
    },
    {
      "parameters": {
        "amount": 5,
        "unit": "seconds"
      },
      "name": "Wait 5 Seconds",
      "type": "n8n-nodes-base.wait",
      "typeVersion": 1,
      "position": [
        200,
        600
      ],
      "webhookId": "c714b235-ff29-4d9d-8cbc-9a773f3244a1"
    },
    {
      "parameters": {
        "spaceId": "spaces/AAAATTakZLA",
        "messageUi": {
          "text": "=The n8n workflow has been successfully re-activated after the error."
        },
        "additionalFields": {}
      },
      "name": "Google Chat Re-Activation Message",
      "type": "n8n-nodes-base.googleChat",
      "typeVersion": 1,
      "position": [
        860,
        600
      ],
      "credentials": {
        "googleApi": {
          "id": "10",
          "name": "Google Chat Service Account"
        }
      }
    },
    {
      "parameters": {
        "functionCode": "// Code here will run once per input item.\n// More info and help: https://docs.n8n.io/nodes/n8n-nodes-base.functionItem\n// Tip: You can use luxon for dates and $jmespath for querying JSON structures\n\n// Add a new field called 'errorType' to the JSON of the item\nif (item.execution) {\n  item.errorType = 'execution'\n} else if (item.trigger) {\n  item.errorType = 'trigger'\n} else {\n  throw new Error(\"Could not identify error type for item: \" + JSON.stringify(item));\n}\n\nreturn item;"
      },
      "name": "Determine Error Type",
      "type": "n8n-nodes-base.functionItem",
      "typeVersion": 1,
      "position": [
        -240,
        880
      ]
    },
    {
      "parameters": {
        "conditions": {
          "string": [
            {
              "value1": "={{$json[\"errorType\"]}}",
              "value2": "trigger"
            }
          ]
        }
      },
      "name": "If Trigger Error",
      "type": "n8n-nodes-base.if",
      "typeVersion": 1,
      "position": [
        -40,
        880
      ]
    },
    {
      "parameters": {
        "values": {
          "string": [
            {
              "name": "errorMessage",
              "value": "=An error occurred when triggering n8n workflow '{{$json[\"workflow\"][\"name\"]}}' from node '{{$json[\"trigger\"][\"error\"][\"node\"][\"name\"]}}'\n*Message:* {{$json[\"trigger\"][\"error\"][\"message\"]}}\n*Cause:* {{$json[\"trigger\"][\"error\"][\"cause\"][\"message\"]}}"
            }
          ]
        },
        "options": {}
      },
      "name": "Set Trigger Error Message",
      "type": "n8n-nodes-base.set",
      "typeVersion": 1,
      "position": [
        200,
        780
      ]
    },
    {
      "parameters": {
        "values": {
          "string": [
            {
              "name": "errorMessage",
              "value": "=An error occurred in n8n workflow '{{$json[\"workflow\"][\"name\"]}}' in node '{{$json[\"execution\"][\"lastNodeExecuted\"]}}'\n*Message:* {{$json[\"execution\"][\"error\"][\"message\"]}}\n*Execution Url:* {{$json[\"execution\"][\"url\"]}}"
            }
          ]
        },
        "options": {}
      },
      "name": "Set Workflow Error Message",
      "type": "n8n-nodes-base.set",
      "typeVersion": 1,
      "position": [
        200,
        960
      ]
    },
    {
      "parameters": {
        "requestMethod": "POST",
        "url": "http://localhost:5675/api/v1/workflows/10/deactivate",
        "options": {},
        "headerParametersUi": {
          "parameter": [
            {
              "name": "X-N8N-API-KEY",
              "value": "xxxxxxxxxxxx"
            }
          ]
        }
      },
      "name": "Deactivate Workflow",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 2,
      "position": [
        420,
        600
      ],
      "notesInFlow": true,
      "notes": "Workaround for https://github.com/n8n-io/n8n/issues/3607"
    }
  ],
  "connections": {
    "Error Trigger": {
      "main": [
        [
          {
            "node": "Determine Error Type",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Re-Activate Workflow": {
      "main": [
        [
          {
            "node": "Google Chat Re-Activation Message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait 5 Seconds": {
      "main": [
        [
          {
            "node": "Deactivate Workflow",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Determine Error Type": {
      "main": [
        [
          {
            "node": "If Trigger Error",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "If Trigger Error": {
      "main": [
        [
          {
            "node": "Set Trigger Error Message",
            "type": "main",
            "index": 0
          },
          {
            "node": "Wait 5 Seconds",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Set Workflow Error Message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Trigger Error Message": {
      "main": [
        [
          {
            "node": "Google Chat Error Message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Workflow Error Message": {
      "main": [
        [
          {
            "node": "Google Chat Error Message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Deactivate Workflow": {
      "main": [
        [
          {
            "node": "Re-Activate Workflow",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
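
For anyone who wants to drive the same sequence from outside n8n, here is a rough sketch of the two requests the HTTP Request nodes above send, in the same order (deactivate, then activate). The endpoint paths and the `X-N8N-API-KEY` header are copied from the workflow JSON. `reactivationRequests` and `ApiRequest` are hypothetical names, and the base URL, workflow ID, and API key are placeholders you would supply yourself.

```typescript
// Sketch only: builds the request descriptions without sending anything.

interface ApiRequest {
  method: "POST";
  url: string;
  headers: { "X-N8N-API-KEY": string };
}

/**
 * Returns the two calls in the order the workaround workflow makes them:
 * first deactivate the workflow, then activate it again.
 */
function reactivationRequests(
  baseUrl: string,
  workflowId: string,
  apiKey: string,
): ApiRequest[] {
  // Drop trailing slashes so the joined URL has no double slash.
  const root =
    baseUrl.replace(/\/+$/, "") + "/api/v1/workflows/" + encodeURIComponent(workflowId);
  const headers = { "X-N8N-API-KEY": apiKey };
  return [
    { method: "POST", url: root + "/deactivate", headers },
    { method: "POST", url: root + "/activate", headers },
  ];
}
```

Each entry can be passed to any HTTP client, one after the other. The workflow above waits 5 seconds before the first call, which a script would need to reproduce itself.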

rottmann commented 2 years ago

@mark-monteiro Thanks, but this only helps with an error triggered by a workflow.

In my case, the n8n system has an error, not the workflow.

My problem is that n8n randomly fails to restart a workflow that n8n itself stopped before ('Unable to initialize workflow').

I have a common trigger for all errors (similar to yours, but simpler), and it does not fire in that case.

I think it has to do with the "EmailReadImap" node preventing the workflow from being stopped internally when n8n re-initializes all workflows (after some log cleanup? SQLite optimization? I don't know). The workflow is shown as stopped, but EmailReadImap still reconnects to our server every 5 minutes, so it must be running in the background (I see the reconnects in the mail server log files).

janober commented 2 years ago

In my case, the n8n system has an error, not the workflow.

Not sure I understand that part. If n8n cannot activate a workflow, or if a workflow gets disabled because of a problem, it will run the Error workflow (at least if you are using an up-to-date version of n8n).
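
As an illustration, the workaround workflow posted earlier in this thread tells the two cases apart by checking which key the error item carries: `execution` or `trigger`. A minimal sketch of that check follows. The field names are taken from that workflow's expressions, not from a published schema, and `ErrorItem`, `errorTypeOf`, and `summarize` are hypothetical names.

```typescript
// Sketch only: field names mirror the workaround workflow in this thread.

interface ErrorItem {
  workflow?: { name?: string };
  execution?: { lastNodeExecuted?: string; error?: { message?: string }; url?: string };
  trigger?: { error?: { message?: string; node?: { name?: string } } };
}

type ErrorType = "execution" | "trigger";

/** Same branching as the "Determine Error Type" node: execution first, then trigger. */
function errorTypeOf(item: ErrorItem): ErrorType {
  if (item.execution) return "execution";
  if (item.trigger) return "trigger";
  throw new Error("Could not identify error type for item: " + JSON.stringify(item));
}

/** Builds a one-line notification text for either kind of error. */
function summarize(item: ErrorItem): string {
  const name = item.workflow?.name ?? "(unknown workflow)";
  if (errorTypeOf(item) === "trigger") {
    return `Trigger error in '${name}': ${item.trigger?.error?.message ?? "(no message)"}`;
  }
  return `Execution error in '${name}': ${item.execution?.error?.message ?? "(no message)"}`;
}
```

An item that only carries a `trigger` key is the activation or connection failure case, which is the one discussed in this issue.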

mark-monteiro commented 2 years ago

@rottmann Thanks for the clarification. It sounds like this is a different problem from the one I have. I think it would be appropriate to open a separate issue, where you can include all the information requested in the issue template.

rottmann commented 2 years ago

OK, I opened a separate issue, https://github.com/n8n-io/n8n/issues/3794, since it seems to be a different problem.

Joffcom commented 1 year ago

Hey @mark-monteiro,

We made an update to the IMAP node towards the end of last year that fixes this: if the IMAP node times out now, we reconnect automatically. To get the change it should just be a case of updating. As we believe this is resolved, I am going to mark this one as closed. If you are still seeing this issue, let me know and we can open it again.

mark-monteiro commented 1 year ago

@Joffcom Unfortunately it doesn't look like this was fixed. I updated to 0.211.2 this morning and I've already seen this come up again. Here's what it looks like in the logs now. It doesn't appear like an attempt was made to reconnect after the error.

n8n                | 2023-01-19T16:29:35.476Z | error    | Email Read Imap node encountered a connection error "{\n  error: Error: read ETIMEDOUT\n      at TCP.onStreamRead (node:internal/stream_base_commons:217:20)\n      at TCP.callbackTrampoline (node:internal/async_hooks:130:17) {\n    errno: -110,\n    code: 'ETIMEDOUT',\n    syscall: 'read',\n    source: 'socket'\n  },\n  file: 'EmailReadImapV1.node.js'\n}"

janober commented 1 year ago

Thanks for reporting back, @mark-monteiro! Reopening the issue.

Joffcom commented 1 year ago

Hey @mark-monteiro,

In theory the reconnect should be fixed in the v1 node as well, but as a test, could you add a new IMAP node, which will use the updated v2 node, and see if that works?

Joffcom commented 1 year ago

Hey @mark-monteiro,

Any luck when adding a new IMAP node?

mark-monteiro commented 1 year ago

Sorry @Joffcom haven't found the time to update my workflows yet. I'm hoping to get to it some time this week, and will let you know how it goes 👍

mark-monteiro commented 1 year ago

I have updated my workflow and will let you know if I see the issue again. It usually pops up several times per day so I should have an answer either way within a few days.

mark-monteiro commented 1 year ago

Two weeks without issue, looks like this can be closed. Thanks for the assistance!