New command: Export Microsoft Teams messages for the given user or team

rabwill commented 3 years ago

Usage

m365 teams message export [options]

Description

Export Microsoft Teams chat messages for a given user or team.

Options

| Option | Description |
|---|---|
| `--userId [userId]` | The ID of the user whose messages are to be exported. Specify either `userId`, `userName` or `teamId`. |
| `--userName [userName]` | The UPN of the user whose messages are to be exported. Specify either `userId`, `userName` or `teamId`. |
| `--teamId [teamId]` | The ID of the team whose messages are to be exported. Specify either `userId`, `userName` or `teamId`. |
| `--fromDateTime [fromDateTime]` | The start of the time range to query. |
| `--toDateTime [toDateTime]` | The end of the time range to query. |
| `-l, --licenseModel [licenseModel]` | The license model to apply to the protected Teams export API. Allowed values are `modelA`, `modelB`, `evaluation`. The default is `evaluation`. Read more about the license requirements. |
| `-a, --withAttachments` | When specified, the command will also download all chat attachments. Use `folderPath` as well to set the root folder. |
| `-p, --folderPath [folderPath]` | The folder path to save downloaded attachments to when using `withAttachments`. |
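
To make the option rules above concrete, here is a minimal validation sketch. The `ExportOptions` interface and `validateExportOptions` function are hypothetical names, not part of the CLI, and the rule that `folderPath` requires `withAttachments` is an assumption read from the option descriptions.

```typescript
// Hypothetical shape of the parsed options; names mirror the spec above.
interface ExportOptions {
  userId?: string;
  userName?: string;
  teamId?: string;
  fromDateTime?: string;
  toDateTime?: string;
  licenseModel?: string;
  withAttachments?: boolean;
  folderPath?: string;
}

const allowedLicenseModels = ['modelA', 'modelB', 'evaluation'];

// Returns an error message, or undefined when the options are valid.
function validateExportOptions(options: ExportOptions): string | undefined {
  // Exactly one of the three target options may be set.
  const targets = [options.userId, options.userName, options.teamId]
    .filter(value => value !== undefined);
  if (targets.length !== 1) {
    return 'Specify exactly one of userId, userName or teamId';
  }

  // Both date options must be parseable when present.
  for (const name of ['fromDateTime', 'toDateTime'] as const) {
    const value = options[name];
    if (value !== undefined && Number.isNaN(Date.parse(value))) {
      return `${value} is not a valid date for ${name}`;
    }
  }

  if (options.fromDateTime !== undefined && options.toDateTime !== undefined &&
    Date.parse(options.fromDateTime) > Date.parse(options.toDateTime)) {
    return 'fromDateTime must not be later than toDateTime';
  }

  if (options.licenseModel !== undefined &&
    allowedLicenseModels.indexOf(options.licenseModel) === -1) {
    return `${options.licenseModel} is not a valid licenseModel`;
  }

  // Assumption: folderPath is only meaningful together with withAttachments.
  if (options.folderPath !== undefined && !options.withAttachments) {
    return 'folderPath can only be used together with withAttachments';
  }

  return undefined;
}
```

For example, `validateExportOptions({ userId: 'a', teamId: 'b' })` returns an error message because two targets are set.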

Remarks

Microsoft Teams APIs in Microsoft Graph that access sensitive data are considered protected APIs. Export APIs require that you have additional validation, beyond permissions and consent, before you can use them. To request access to these protected APIs, complete the request form.

Additional Information

https://docs.microsoft.com/en-us/microsoftteams/export-teams-content#how-to-access-teams-export-apis

Prerequisites to access this API

https://docs.microsoft.com/en-us/microsoftteams/export-teams-content#prerequisites-to-access-teams-export-apis

rabwill commented 3 years ago

@pnp/cli-for-microsoft-365-maintainers here is a new command spec for review. Please let me know if this is okay, and then we can put it in the backlog and label it appropriately. Thanks 😊

waldekmastykarz commented 3 years ago

Thanks for the suggestion.

1. I wouldn't call the command export. As it retrieves all items, we should name it Get all Microsoft Teams messages for the given user.
2. To make the command easier to use, I'd suggest that we allow specifying the user's UPN rather than ID, which users would need to retrieve separately first.
3. Are filter, pageSize and pageNumber meant to be required? If not, let's denote them as optional using [] rather than <>.
4. What can be specified in the filter?
5. What is the default behavior if pageSize and pageNumber are not specified?

rabwill commented 3 years ago

Good catch on point #3! Thanks @waldekmastykarz. In the filter they can specify the date range, messageType, etc. See the sample response here:

{
  "id": "string (identifier)",
  "replyToId": "string (identifier)",
  "etag": "string",
  "messageType": "string",
  "createdDateTime": "string (timestamp)",
  "lastModifiedDateTime": "string (timestamp)",
  "deletedDateTime": "string (timestamp)",
  "subject": "string",
  "from": {
    "application": null,
    "device": null,
    "conversation": null,
    "user": {
      "id": "string (identifier)",
      "displayName": "string",
      "userIdentityType": "aadUser"
    }
  },
  "body": {"@odata.type": "microsoft.graph.itemBody"},
  "summary": "string",
  "chatId": "string (identifier)",
  "attachments": [{"@odata.type": "microsoft.graph.chatMessageAttachment"}],
  "mentions": [{"@odata.type": "microsoft.graph.chatMessageMention"}],
  "importance": "string",
  "locale": "string"
}

The API returns @odata.nextLink when there are more results to retrieve, similar to other commands of this sort.
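
A rough sketch of how following @odata.nextLink could look. The `getPage` callback is a hypothetical stand-in for whatever HTTP client performs the request, so this is an illustration of the paging pattern and not the CLI's implementation.

```typescript
// Shape of one page of a Microsoft Graph collection response.
interface Page<T> {
  value: T[];
  '@odata.nextLink'?: string;
}

// getPage is a hypothetical function that performs the HTTP GET for a URL;
// injecting it keeps the sketch independent of any specific HTTP client.
async function getAllPages<T>(
  firstUrl: string,
  getPage: (url: string) => Promise<Page<T>>
): Promise<T[]> {
  const items: T[] = [];
  let url: string | undefined = firstUrl;

  // Keep requesting pages until the response no longer carries a next link.
  while (url) {
    const page: Page<T> = await getPage(url);
    items.push(...page.value);
    url = page['@odata.nextLink'];
  }

  return items;
}
```

The caller passes only the first URL; every subsequent URL is taken verbatim from the previous response.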

plamber commented 3 years ago

Not sure if we will be able to use this command at a large scale. I fell into this already a couple of times with some other interesting endpoints.

Microsoft Teams APIs in Microsoft Graph that access sensitive data are considered protected APIs. Export APIs require that you have additional validation, beyond permissions and consent, before you can use them. To request access to these protected APIs, complete the request form.

The command will throw a 401 exception until you get approval from Microsoft and your AppId has been enabled. @waldekmastykarz I do not believe that MS will enable our PnP CLI app with these permissions due to security reasons. I have tried to get the approval a couple of times in the past but failed.

Is this something we want to clarify before? Maybe have some discussions with the Microsoft responsible parties to see how we can proceed here?

waldekmastykarz commented 3 years ago

@rabwill rather than having a catch-all filter property, I'd suggest we introduce specific options for the specific values so that we can apply validation to it.

@plamber I wouldn't expect getting clearance for our app, especially as it doesn't support app-only auth. Instead, users would get clearance for their own app and use it in combination with CLI. In the past, I requested and got approval so the process is working given proper justification. I agree that the required clearance will likely limit the usage of this command. Perhaps if we can describe a number of usage scenarios and the justification, it could help users get approvals and use this command.

rabwill commented 3 years ago

@waldekmastykarz so just the date range to start with, and we can add more filters later as needed?

waldekmastykarz commented 3 years ago

@rabwill is date the most common filter?

rabwill commented 3 years ago

Since it is already mentioned as a filter example in the doco, perhaps it's a good start? ☺️

waldekmastykarz commented 3 years ago

Good point 👍
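
As an illustration of the date range option discussed above, a sketch that turns the two dates into an OData filter. It assumes the export API filters on `lastModifiedDateTime`, as in the documentation example mentioned above, and that both values were validated as dates beforehand. `buildDateFilter` is a hypothetical helper name.

```typescript
// Builds an OData $filter expression for the requested time range.
// Assumption: the export API filters on lastModifiedDateTime.
// Assumption: both inputs were already validated as parseable dates;
// new Date(...).toISOString() throws a RangeError for invalid input.
function buildDateFilter(fromDateTime?: string, toDateTime?: string): string {
  const clauses: string[] = [];

  if (fromDateTime) {
    clauses.push(`lastModifiedDateTime gt ${new Date(fromDateTime).toISOString()}`);
  }

  if (toDateTime) {
    clauses.push(`lastModifiedDateTime lt ${new Date(toDateTime).toISOString()}`);
  }

  // An empty string means no filter should be added to the request.
  return clauses.join(' and ');
}
```

The result would be passed as the value of the `$filter` query parameter, URL-encoded by the caller.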

rabwill commented 3 years ago

Shall we move this to backlog then? Do we need more discussion @plamber @waldekmastykarz ?

waldekmastykarz commented 3 years ago

I think we're good to go 👍

dips365 commented 3 years ago

I would like to take this

garrytrinder commented 2 years ago

Opening up due to lack of activity

martinlingstuyl commented 2 years ago

We do have a command for listing messages nowadays

https://pnp.github.io/cli-microsoft365/cmd/teams/chat/chat-message-list/

How would this relate? Should we maybe review the specs?

waldekmastykarz commented 2 years ago

Following this spec, this command would use the Teams export APIs which are different than the Chat message list APIs. Chat message list APIs are not meant to be used at scale and could easily lead to throttling, especially on active tenants.

martinlingstuyl commented 2 years ago

I understand, but in that case I think we should review the command name and description. Otherwise it looks like we have two commands for what seems to be the same functionality.

martinlingstuyl commented 2 years ago

For example teams chat message export?

martinlingstuyl commented 2 years ago

Oh wait, this is across all chats for the user. Never mind. Great spec!

martinlingstuyl commented 2 years ago

On...the...other...hand 🤗

The teams export api exposes two endpoints according to the docs, one for user chat messages and the other for channel messages.

If we were to add both as commands, we'd run into our existing channel message list.

To align it with our existing commands we could have two new commands:

- teams chat message export (current issue)
- teams channel message export (to be created as issue)

This way they would align better and it would be clearer what the commands do.

I'd also vote for an additional option:

| Option | Description |
|---|---|
| `--licenseModel [licenseModel]` | The license model to use when executing the command. Accepted values: `A/B/Evaluation`. Default value is Evaluation. Read more about licensing for export APIs in the docs. |

waldekmastykarz commented 2 years ago

Great suggestion @martinlingstuyl! What if we'd combine them both under teams message export? We already have a message namespace in the CLI. Would that make sense or are the chat message and channel message APIs so different that it wouldn't make sense to combine them under message?

martinlingstuyl commented 2 years ago

I think the only thing that's different is the URL. So it would be possible to combine them.

Now that you mention it, we do not have teams channel message list; I was speaking too quickly. We have teams message list. All the teams message <verb> commands refer to channel messages.

You could debate whether it is logical to put a command in there that can export both channel and chat messages. On the other hand, the teams chat message namespace is all about the current user, while teams message is not really about the current user.

I think I like the idea. We do need different options:

| Option | Description |
|---|---|
| `--userId [userId]` | The ID of the user whose messages are to be exported. Specify either `userId` or `channelId`. |
| `--channelId [channelId]` | The ID of the channel whose messages are to be exported. Specify either `userId` or `channelId`. |

I'd say we drop userName from the specs.

Adam-it commented 2 years ago

@martinlingstuyl I agree. Since we already have the m365 teams message list with --channelId option I would leave it and include a new option which is --userId. So I totally agree and feel it the same way as you described 👍. @pnp/cli-for-microsoft-365-maintainers any other comments? maybe we could open this up?... and maaaybe even as "good first issue" ?

martinlingstuyl commented 2 years ago

As discussed above, we should update the spec a bit first:

teams message export [options]

martinlingstuyl commented 2 years ago

I updated it, what do you guys think?

Adam-it commented 2 years ago

@martinlingstuyl weren't we supposed to drop the userName option for now?

Besides this single comment, everything else seems OK and I think we may open this up. Do you think this might be a "good first issue"?

martinlingstuyl commented 2 years ago

We have these double options userId and userName in other places as well, hence I added them both. It's effectively the same thing when calling the Graph API, but it's clearer to users what they can use.

But before we open it up: we need some additional research here:

https://devblogs.microsoft.com/microsoft365dev/announcing-general-availability-of-microsoft-graph-export-api-for-microsoft-teams-messages/

We might need teamId instead or next to channelId.

And we certainly need a licenseModel option as well. We discussed that earlier and it applies to this API.

martinlingstuyl commented 2 years ago

Also, what about retrieving chat attachments?

Message Attachments: Export APIs include the links to the attachments that are sent as part of messages. Using Export APIs you can retrieve the files attached in the messages.

We could add something of a switch to trigger behavior that downloads the attachments.

martinlingstuyl commented 2 years ago

Ok, I updated the specs a bit, adding the licenseModel option and teamId.

What do you guys think of auto-exporting the attachments as well (using the --includeAttachments and --filePath options)? Or should we leave that to the user, as the attachment URLs are provided?

martinlingstuyl commented 2 years ago

I also did a check: it seems the getAllMessages endpoint can only be called at team level, for all channels. It seems to be unavailable at channel level.

I need to do a better test, but it seems we can drop channelId.

Adam-it commented 2 years ago

good research @martinlingstuyl 👍. I think the --includeAttachments option would be ok and not exporting them by default.
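
To illustrate what an opt-in attachment export would have to work with, a sketch that collects the attachment links from exported messages. The interfaces are trimmed-down, hypothetical versions of the Graph types, and treating `contentType` `reference` as the marker for file attachments is an assumption that would need to be checked against real responses.

```typescript
// Trimmed-down, hypothetical versions of the Graph message types.
interface ChatMessageAttachment {
  id?: string;
  contentType?: string;
  contentUrl?: string | null;
  name?: string | null;
}

interface ChatMessage {
  id: string;
  attachments?: ChatMessageAttachment[];
}

interface AttachmentDownload {
  messageId: string;
  name: string;
  url: string;
}

// Collects the files that would need to be downloaded for the given messages.
function collectFileAttachments(messages: ChatMessage[]): AttachmentDownload[] {
  const downloads: AttachmentDownload[] = [];

  for (const message of messages) {
    for (const attachment of message.attachments ?? []) {
      // Assumption: file attachments are the ones with contentType
      // "reference" and a contentUrl pointing at the stored file.
      if (attachment.contentType === 'reference' && attachment.contentUrl) {
        downloads.push({
          messageId: message.id,
          name: attachment.name ?? attachment.id ?? 'attachment',
          url: attachment.contentUrl
        });
      }
    }
  }

  return downloads;
}
```

Keeping the collection step separate from the download step means the default behavior (no download) only has to skip this call.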

martinlingstuyl commented 2 years ago

Thanks Adam,

Reading the documentation, it's either:

- Get all messages for a user across all chats
- Or get all messages in a team across all channels

https://learn.microsoft.com/en-us/graph/api/channel-getallmessages?view=graph-rest-1.0&tabs=http

So I'll remove channelId.
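
A sketch of how the two export endpoints could be selected from the options. The helper name `buildExportUrl` is hypothetical, and the mapping of `modelA`/`modelB` to a `model` query parameter, with evaluation mode sending no parameter at all, is my reading of the Graph documentation and should be treated as an assumption.

```typescript
type LicenseModel = 'modelA' | 'modelB' | 'evaluation';

interface ExportTarget {
  userId?: string;
  userName?: string;
  teamId?: string;
}

const graphRoot = 'https://graph.microsoft.com/v1.0';

// Picks the user-level or team-level getAllMessages endpoint.
function buildExportUrl(target: ExportTarget, licenseModel: LicenseModel = 'evaluation'): string {
  // The user endpoint accepts either the user ID or the UPN.
  const user = target.userId ?? target.userName;
  let url: string;

  if (user !== undefined) {
    url = `${graphRoot}/users/${encodeURIComponent(user)}/chats/getAllMessages`;
  }
  else if (target.teamId !== undefined) {
    url = `${graphRoot}/teams/${encodeURIComponent(target.teamId)}/channels/getAllMessages`;
  }
  else {
    throw new Error('Specify userId, userName or teamId');
  }

  // Assumption: evaluation mode is selected by omitting the model parameter.
  if (licenseModel === 'modelA') {
    url += '?model=A';
  }
  else if (licenseModel === 'modelB') {
    url += '?model=B';
  }

  return url;
}
```

Because only the URL differs between the two cases, the rest of the command (paging, filtering, output) could stay identical for users and teams.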

Any thoughts from other @pnp/cli-for-microsoft-365-maintainers about the attachments?

waldekmastykarz commented 1 year ago

Looking at the different purpose and functionality (and caveats) of the export and the regular list API, I wonder if we should combine both commands after all or if we should rather keep them separate. My gut feeling says to keep them separate, because using the export API requires you to have a specific AAD app, which has been granted special API access and a license, whereas you don't need either for the regular list command, which isn't meant btw to be used at scale.

Adam-it commented 1 year ago

@waldekmastykarz this is a very valid comment 😯. @martinlingstuyl what do you think about making two separate commands? And maybe separate issues?

waldekmastykarz commented 1 year ago

We already have teams chat message list, so we'd just need one new command for export, which this issue is for, correct?

martinlingstuyl commented 1 year ago

Correct,

We have teams chat message list. The export should be a separate command.

I'm doubting though if we should name this command teams chat message export. Instead of teams message export. It would be more consistent with the other command.

What about the question around attachments. What do you think of that?

waldekmastykarz commented 1 year ago

> I'm doubting though if we should name this command teams chat message export. Instead of teams message export. It would be more consistent with the other command.

teams message export seems more fitting indeed. Looking at the docs, it seems like you can't export messages from a specific chat, so adding chat in the command's name would be misleading.

> What about the question around attachments. What do you think of that?

How about --withAttachments?

martinlingstuyl commented 1 year ago

OK, check the updated specs. How about that?

waldekmastykarz commented 1 year ago

I'd suggest we rename --filePath to --folderPath, since we want people to point to a folder rather than a file

waldekmastykarz commented 1 year ago

Do we need the pageSize and pageNumber options?

martinlingstuyl commented 1 year ago

> Do we need the pageSize and pageNumber options?

We don't!

> I'd suggest we rename --filePath to --folderPath, since we want people to point to a folder rather than a file

🤦‍♂️ OMC (Cussing for nerds 2.0: oh my code)

updated the specs...

Adam-it commented 1 year ago

@martinlingstuyl looks ok to me 👍

Adam-it commented 1 year ago

@waldekmastykarz any comments? it has been a long run 🙂 @martinlingstuyl should we open it up ?

MathijsVerbeeck commented 1 year ago

Would love to work on this one.

milanholemans commented 1 year ago

🧹time! All yours @MathijsVerbeeck!

MathijsVerbeeck commented 1 year ago

Just some feedback regarding this:

I've written the code, but am still struggling with something.

When retrieving the messages, we will get a JSON of the message with an attachment body like this:

If we then use --withAttachments, we would have to download this file. This cannot be done by directly using the contentUrl, but we could do this by using getFileByServerRelativePath.

I'm just still struggling with the inconsistency in how we would define our URL since, theoretically, it is possible to refer to a file on the root site collection, which would make the web URL hard to compose.

Not sure if any of you have any suggestions regarding this?

waldekmastykarz commented 1 year ago

> I'm just still struggling with the inconsistency on how we would define our url, as, theoretically, it is possible to refer to a file on the root site collection, which would make the weburl hard to compose.

Not sure I understand the problem. If we split the URL into authority (protocol + domain) and the server-relative path, wouldn't that be enough to build a URL like https://contoso.sharepoint.com/_api/<the API call to get the URL by server-relative URL>?
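
Roughly, the split suggested here could be sketched as follows. `buildGetFileRequestUrl` is a hypothetical helper, and the way the path is escaped and encoded inside `DecodedUrl` is an assumption that has not been verified against SharePoint.

```typescript
// Splits a full file URL into authority and server-relative path and
// builds a getFileByServerRelativePath request against the authority.
function buildGetFileRequestUrl(fileUrl: string): string {
  const parsed = new URL(fileUrl);

  // Authority is protocol + domain; the path is everything after it.
  const authority = parsed.origin;
  const serverRelativePath = decodeURIComponent(parsed.pathname);

  // Assumption: single quotes are escaped by doubling them, and the
  // path is percent-encoded when placed inside DecodedUrl.
  const escapedPath = serverRelativePath.replace(/'/g, "''");

  return `${authority}/_api/web/getfilebyserverrelativepath(DecodedUrl='${encodeURIComponent(escapedPath)}')`;
}
```

Note that this always targets the web at the root of the domain, which only works when the file actually lives in that web.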

MathijsVerbeeck commented 1 year ago

@waldekmastykarz But if, let's say, I have a file under /sites/testsite, I cannot retrieve this file using https://contoso.sharepoint.com/_api/web/getfilebyserverrelativepath(DecodedUrl='/sites/testsite/shared documents/file.docx'), as this will throw a "file does not exist" error. You have to call this specifically on the site collection. So the question is: how would I know which site collection to retrieve the files from?

waldekmastykarz commented 1 year ago

So the challenge is to find site collection URL from the full file URL? If so, it's vaguely familiar and I'm pretty sure we implemented it somewhere. Not sure though if it was for SharePoint or Graph APIs, but I'm pretty sure we have a solution for it in our code base.
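
One possible heuristic for this, sketched under the assumption that site collections live under the default managed paths (`sites`, `teams`, `personal`) and that everything else belongs to the root site collection. This is not the existing solution in the code base, and `guessSiteCollectionUrl` is a hypothetical name.

```typescript
// Assumption: only the default SharePoint managed paths are in use.
const managedPaths = ['sites', 'teams', 'personal'];

// Guesses the site collection URL from a full file URL.
function guessSiteCollectionUrl(fileUrl: string): string {
  const parsed = new URL(fileUrl);
  const segments = parsed.pathname.split('/').filter(segment => segment.length > 0);

  // /sites/<name>/..., /teams/<name>/... and /personal/<name>/... point
  // at a site collection below the root.
  if (segments.length >= 2 && managedPaths.indexOf(segments[0].toLowerCase()) !== -1) {
    return `${parsed.origin}/${segments[0]}/${segments[1]}`;
  }

  // Anything else is treated as living in the root site collection.
  return parsed.origin;
}
```

A lookup through an API that resolves the site from the URL would be more robust than string parsing, since custom managed paths would break this guess.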

MathijsVerbeeck commented 1 year ago

Exactly. I will have a look then 😄 Thanks for the feedback!

waldekmastykarz commented 1 year ago

Check out getGraphSiteInfoFromFullUrl in file convert pdf. Not sure if it's an exact match but it could help you get started.
