MangoTheCat / mailman

R package for a wrapper around the python mailbox module

get_messages() error when get_payload() returns list rather than vector #1

Open dgarmat opened 6 years ago

dgarmat commented 6 years ago

Hey, nice article (https://www.mango-solutions.com/blog/snakes-in-a-package-combining-python-and-r-with-reticulate). I'm trying out the R wrapper package, but I can't get read_messages() to work without hitting this error:

Error in result[i, ] <- fields : incorrect number of subscripts on matrix
In addition: Warning message:
In fields[number_of_columns] <- payload_with_body$get_payload() :
 number of items to replace is not a multiple of replacement length

Going line by line, the issue seems to come from get_messages() when it runs result[i,] <- fields at i = 29. At i = 28, result changes from a matrix to a list, probably because of this:

> length(payload_with_body$get_payload())
[1] 2

So the issue seems to come from this part, where the payload sometimes comes back as a length-2 list instead of a length-1 character vector:

    # now retrieve the body
    if(message$is_multipart()){
      # sometimes a message is split into sub-messages
      # through inspection we see the body is stored in the second sub-message
      payload_with_body <- message$get_payload(1L)
      # we convert the sub-message to string
      fields[number_of_columns] <- payload_with_body$get_payload()
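
The list-versus-string behaviour described above can be reproduced directly in Python, the language of the wrapped module. The sketch below is only an illustration with made-up message data (the `raw` string and its boundaries are assumptions, not taken from any real mbox file): `get_payload()` returns a string for a non-multipart part, but a list of sub-messages when the part is itself multipart.

```python
from email import message_from_string

# A non-multipart message: get_payload() returns a str.
plain = message_from_string("Subject: hi\n\nbody text\n")

# Hypothetical example data: a multipart message whose second part
# is itself multipart (multipart/alternative nested in multipart/mixed).
raw = (
    "Subject: nested\n"
    'Content-Type: multipart/mixed; boundary="outer"\n'
    "\n"
    "--outer\n"
    "Content-Type: text/plain\n"
    "\n"
    "first part\n"
    "--outer\n"
    'Content-Type: multipart/alternative; boundary="inner"\n'
    "\n"
    "--inner\n"
    "Content-Type: text/plain\n"
    "\n"
    "nested plain\n"
    "--inner\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>nested html</p>\n"
    "--inner--\n"
    "--outer--\n"
)
nested = message_from_string(raw)

# Same call as message$get_payload(1L) in the R snippet: the second sub-message.
second = nested.get_payload(1)

plain_payload_type = type(plain.get_payload()).__name__
second_payload_type = type(second.get_payload()).__name__
second_payload_len = len(second.get_payload())
print(plain_payload_type, second_payload_type, second_payload_len)
```

When reticulate converts that Python list to R, it arrives as an R list, which would explain the failed assignment into a single matrix cell.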

dgarmat commented 6 years ago

Changing the code in the function to this worked:

    # now retrieve the body
    if(message$is_multipart()){
      # sometimes a message is split into sub-messages
      # through inspection we see the body is stored in the second sub-message
      payload_with_body <- message$get_payload(1L)
      # we convert the sub-message to string
      if(typeof(payload_with_body$get_payload()) == "character"){
        fields[number_of_columns] <- payload_with_body$get_payload()  
      } else if(typeof(payload_with_body$get_payload()) == "list"){
        # nested multipart: keep only the first sub-part, coerced to a string
        fields[number_of_columns] <- as.character(payload_with_body$get_payload()[[1]])
      }

But now it stops at message 1236 on a different issue.
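
One way to avoid depending on a fixed position such as the second sub-message is to walk every part and take the first `text/plain` leaf. This is a sketch of that alternative in Python, not the package's implementation; `first_text_body` is a hypothetical helper name, and it does not decode base64 or quoted-printable bodies.

```python
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def first_text_body(message: Message) -> str:
    """Return the payload of the first text/plain leaf part, or "" if none."""
    # walk() yields the message itself, then every sub-part, depth first.
    for part in message.walk():
        if part.is_multipart():
            continue  # containers have a list payload; only leaves hold text
        if part.get_content_type() == "text/plain":
            payload = part.get_payload()
            return payload if isinstance(payload, str) else ""
    return ""


# Hypothetical test data: the plain-text body sits inside a nested part.
inner = MIMEMultipart("alternative")
inner.attach(MIMEText("nested plain", "plain"))
inner.attach(MIMEText("<p>nested html</p>", "html"))
outer = MIMEMultipart("mixed")
outer.attach(MIMEText("<p>html first</p>", "html"))
outer.attach(inner)

body = first_text_body(outer)
solo_body = first_text_body(MIMEText("solo", "plain"))
print(body, solo_body)
```

Because the helper always returns a single string, the result fits into one cell regardless of how deeply the message is nested.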

adfi commented 6 years ago

Thanks @dgarmat for reporting this. The data that I tested this on must have had a specific structure for multi-part messages. I haven't looked into all the possible structures.

Would it be possible for you to send me the mbox file? I don't mind implementing your fix, but as you say, you get different errors for different messages.

adfi commented 6 years ago

@dgarmat is this still an issue?

dgarmat commented 6 years ago

Hi Adnan, I'm able to work around it, but I'd rather not send my mbox file because it's all my sent mail for the last several years :) I wonder about adding a warning if a file doesn't parse, rather than an error. There could always be a handful of unusual formats that fail.

dgarmat commented 6 years ago

Or rather, by a file not parsing, I mean a specific email.
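
The warn-instead-of-fail idea could look roughly like the following Python sketch. The names `parse_all` and `strict_body` are hypothetical and stand in for whatever per-message parsing the package does; the point is only that one unparseable email produces a warning and an empty row rather than aborting the whole run.

```python
import warnings
from email import message_from_string
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Callable, Iterable, List, Optional


def parse_all(messages: Iterable[Message],
              parse_one: Callable[[Message], dict]) -> List[Optional[dict]]:
    """Parse each message; warn and record None for any message that fails."""
    rows: List[Optional[dict]] = []
    for index, message in enumerate(messages, start=1):
        try:
            rows.append(parse_one(message))
        except Exception as error:
            warnings.warn(f"message {index} could not be parsed: {error}")
            rows.append(None)
    return rows


def strict_body(message: Message) -> dict:
    """Hypothetical strict parser that rejects list payloads."""
    payload = message.get_payload()
    if not isinstance(payload, str):
        raise TypeError("payload is not a single string")
    return {"subject": message["Subject"], "body": payload}


good = message_from_string("Subject: ok\n\nhello\n")
bad = MIMEMultipart("mixed")          # its payload is a list, so strict_body fails
bad["Subject"] = "bad"
bad.attach(MIMEText("x", "plain"))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    rows = parse_all([good, bad], strict_body)
print(rows, len(caught))
```

In R, the same shape would presumably use tryCatch() around the per-message step and warning() in the error handler.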

adfi commented 6 years ago

That's a good idea, I'll add it later this week.