vshymanskyy / TinyGSM

A small Arduino library for GSM modules, that just works
GNU Lesser General Public License v3.0

SIM7000: TLS support plus other minor improvements #507

Closed: FStefanni closed this 3 years ago

FStefanni commented 3 years ago

Hi,

I have based this pr on the great code by @marmotton and @javipelopi. I have just:

Now it works perfectly with MQTT both with and without TLS.

Fixes #437 Fixes #474 Fixes #495

Regards

FStefanni commented 3 years ago

Hi,

As usual, all checks failed due to an error in the regression (not in the code).

Regards.

gsvitak commented 3 years ago

@FStefanni THANK YOU!!

One suggestion: can you please include a working sample in the examples/more directory, similar to the SIM800 one?

https://github.com/vshymanskyy/TinyGSM/blob/master/examples/more/SIM800_SslSetCert/SIM800_SslSetCert.ino

marmotton commented 3 years ago

@FStefanni thanks !

I'm wondering if this should be a separate device, as we'd be losing the ability to use 8 muxes on unencrypted connections. I'm also thinking that the CIP command set is more reliable if there is no need for SSL (I didn't really test this, so I could be wrong).

FStefanni commented 3 years ago

Hi,

@gsvitak: I have written an example. It compiles, but unfortunately I cannot test it or give more support on this (sorry, but I am doing this for work, I am quite busy, and this feature is not among my goals at the moment). The code is based on the SIM800 example and the guide by @marmotton (here). You can find the code here. Please remember that you should then send an AT command to tell the modem to use the certificate (e.g. sendAT(GF("+CASSLCFG="), mux, ",CACERT,\"dstrootx3.crt\""); please refer to here to check where to insert it). If it works, or if someone wants to contribute fixes to make it work properly, I can include it in the PR, and I can also add an option to simplify the usage (i.e. to send the AT+CASSLCFG command).
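
For readers who want to see what that call puts on the wire, here is a minimal sketch. It assumes that sendAT() prefixes its arguments with "AT", which matches the AT+CASSLCFG lines in the logs later in this thread. buildCaCertCommand is a hypothetical helper for illustration only, not part of TinyGSM or of this PR.

```cpp
#include <string>

// Hypothetical helper (not a TinyGSM function): returns the command text
// that the sendAT(...) call quoted above is assumed to produce for a mux.
std::string buildCaCertCommand(int mux, const std::string& certName) {
    // Expected shape: AT+CASSLCFG=<mux>,CACERT,"<certificate file name>"
    return "AT+CASSLCFG=" + std::to_string(mux) + ",CACERT,\"" + certName + "\"";
}
```

The certificate name must match the file name previously stored on the modem; the name used in the quoted call is only an example.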

@marmotton: I understand your point, but in practical cases I suppose 2 muxes should be enough, since we are working on embedded systems with few available resources. From another point of view, using non-TLS connections on the web is strongly discouraged for security reasons, so I assume TLS will be in use 98% of the time, and thus we are limited to two muxes anyway. So I preferred to keep the code simpler by creating a single device.

@javipelopi: this week I am going to perform a few stress tests on the code, so I will re-check your proposed change to the modemRead() method. I'll write the results here.

Regards

FStefanni commented 3 years ago

Hi,

some good news.

@gsvitak: I added a method (on both the secure client and the modem) to simplify the usage of certificates. It is setCertificate(), and you can use it to set the name of the certificate to be used by a connection, so there is no longer any need to change the library code to call AT+CASSLCFG. Btw, did you try the sketch to load the certificate? Can I add it to the PR?

@javipelopi: I tested the code and in the end, your fix seems useful: it is not completely clear to me when exactly it is required, but with it the MQTT connection is completely reliable. So I re-enabled it.

Regards

gsvitak commented 3 years ago

@FStefanni thank you!! This is huge for us. We just never got around to adding it yet. We will try it in the next 10 days and get back to you. The code looks good to me.

marmotton commented 3 years ago

@FStefanni I guess your point is quite reasonable, TLS is used almost everywhere and I can imagine that only 1 mux is used in general. I tested (lightly) your code and it works well on my SIM7000E R1351 module. Thanks :)

FStefanni commented 3 years ago

Hi to all,

thank you for your support. I'll wait for possible feedback, and in the meantime, I'll use it in my project too (so I guess that if there is any issue, we will spot it very quickly).

Regards

FStefanni commented 3 years ago

Hi,

I have found an issue. At the moment I am not able to get a JSON response.

For example, using the HttpClient to get worldtimeapi.org/api/timezone/Europe/Rome fails:

10:31:48.590 -> +CASEND: 0,0,2
10:31:48.624 -> AT+CASEND=0,2
10:31:48.726 -> 
10:31:48.726 -> >
10:31:48.726 ->  
10:31:48.861 -> OK
10:31:48.861 -> 
10:31:48.861 -> +CASEND: 0,0,2
10:31:48.861 -> AT+CARECV?
10:31:48.895 -> 
10:31:48.895 -> +CARECV: 0,0
10:31:48.895 -> +CARECV: 1,0
10:31:48.930 -> 
10:31:48.930 -> OK
10:31:49.919 -> AT+CARECV?
10:31:49.953 -> 
10:31:49.953 -> +CADATAIND: 0
10:31:49.953 -> 
10:31:49.953 -> +CASTATE: 0,0
10:31:49.953 -> [41182] ### Closed:  0
10:31:49.953 -> 
10:31:49.953 -> +CARECV: 0,0
10:31:49.953 -> +CARECV: 1,0
10:31:49.987 -> 
10:31:49.987 -> OK
10:31:50.974 -> AT+CASTATE?
10:31:51.008 -> 
10:31:51.008 -> +CASTATE: 0,0
10:31:51.008 -> +CASTATE: 1,0
10:31:51.042 -> 

Basically, the connection and send of the request succeed, but the read fails. This is because as soon as the server has sent the reply, it closes the connection, and therefore the modem flushes the receive buffer.

This can be seen clearly looking at the modem asynchronous messages:

I also tested changing the handling of the asynchronous CASTATE message so that the connection is not marked as closed (by commenting out line 814). The difference is that, instead of polling with AT+CASTATE?, the polling is done with AT+CARECV?, but the reply reports no data to read on mux 0 (this is why I suppose that, as soon as the connection closes, the modem flushes the receive buffer). The log with this change follows:

10:42:04.165 -> AT+CASEND=0,2
10:42:04.268 -> 
10:42:04.268 -> >
10:42:04.268 ->  
10:42:04.405 -> OK
10:42:04.405 -> 
10:42:04.405 -> +CASEND: 0,0,2
10:42:04.405 -> AT+CARECV?
10:42:04.438 -> 
10:42:04.438 -> +CARECV: 0,0
10:42:04.438 -> +CARECV: 1,0
10:42:04.473 -> 
10:42:04.473 -> OK
10:42:05.460 -> AT+CARECV?
10:42:05.493 -> 
10:42:05.493 -> +CADATAIND: 0
10:42:05.493 -> 
10:42:05.493 -> +CASTATE: 0,0
10:42:05.493 -> [49074] ### Closed:  0
10:42:05.493 -> 
10:42:05.493 -> +CARECV: 0,0
10:42:05.493 -> +CARECV: 1,0
10:42:05.528 -> 
10:42:05.528 -> OK
10:42:06.546 -> AT+CARECV?
10:42:06.546 -> 
10:42:06.546 -> +CARECV: 0,0
10:42:06.580 -> +CARECV: 1,0
10:42:06.580 -> 
10:42:06.580 -> OK
10:42:07.599 -> AT+CARECV?
10:42:07.599 -> 
10:42:07.599 -> +CARECV: 0,0
10:42:07.632 -> +CARECV: 1,0
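
To make the polling replies in the logs above easier to inspect, here is a minimal sketch of a parser. It assumes the reply to AT+CARECV? contains one "+CARECV: <mux>,<bytes>" line per mux, as the logs suggest; parseCarecvQuery is a hypothetical helper, not a TinyGSM function.

```cpp
#include <cstdio>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: extracts "mux -> pending bytes" from a raw modem reply.
// Lines that are not "+CARECV: <mux>,<bytes>" (URCs, OK, blanks) are skipped.
std::map<int, int> parseCarecvQuery(const std::string& reply) {
    std::map<int, int> pending;
    std::istringstream lines(reply);
    std::string line;
    while (std::getline(lines, line)) {
        int mux = 0;
        int bytes = 0;
        // sscanf stops at the trailing '\r', so CRLF line endings are fine.
        if (std::sscanf(line.c_str(), "+CARECV: %d,%d", &mux, &bytes) == 2) {
            pending[mux] = bytes;
        }
    }
    return pending;
}
```

Fed with the replies logged above, every mux maps to 0 pending bytes, even right after +CADATAIND: 0 announced data, which is the symptom described in this comment.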

Can anyone help please?

Regards.

marmotton commented 3 years ago

Hi @FStefanni , I'm not sure I can really help, but I also observed this during my initial tests. I have the feeling that we cannot do anything apart from writing Connection: Keep-Alive in the HTTP request and hoping that the server doesn't close the connection too soon.

FStefanni commented 3 years ago

Hi @marmotton,

thank you for the quick reply. This could be a nice workaround: can you please provide a code snippet I can test? Is it an AT command or a parameter of HttpClient?

What puzzles me is that I do not have this issue with the SIM800. So I wonder whether it is an issue with the SIM7000 modem, or whether we are doing something wrong with the AT commands.

Regards.

marmotton commented 3 years ago

@FStefanni I don't have a code example, I tried this by manually writing the HTTP request using AT commands.

I had a look at the HTTP client library, the method sendInitialHeaders() writes "Connection: close" automatically. https://github.com/amcewen/HttpClient/blob/4a2222d2107daf1ef03fc3b7e8e3381de88bfb71/HttpClient.cpp#L200

Instead of using the get() method you could write all the headers manually using sendHeader(). These headers should be sufficient:

GET /api/timezone/Europe/Rome HTTP/1.0
Host: worldtimeapi.org
Connection: keep-alive
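
As a rough illustration of the headers listed above, the following sketch assembles the raw request text so that it can be written directly to a client stream. buildKeepAliveGet is a hypothetical helper, not part of HttpClient or TinyGSM, and whether the server honours keep-alive is outside its control.

```cpp
#include <string>

// Hypothetical helper: builds the request shown above as one string,
// terminated by the blank line that ends the HTTP header section.
std::string buildKeepAliveGet(const std::string& host, const std::string& path) {
    std::string request;
    request += "GET " + path + " HTTP/1.0\r\n";
    request += "Host: " + host + "\r\n";
    request += "Connection: keep-alive\r\n";
    request += "\r\n";  // empty line: end of headers, no body follows
    return request;
}
```

Calling buildKeepAliveGet("worldtimeapi.org", "/api/timezone/Europe/Rome") yields the three header lines above followed by an empty line.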

FStefanni commented 3 years ago

Hi,

@marmotton ok thank you. I'll try and let you know.

Regards.

marmotton commented 3 years ago

@FStefanni good luck! You also wrote about the SIM800. As I recall from my tests, this problem appeared with the new CA command set that we need to use for TLS on the SIM7000. I'm almost sure it works fine with the CIP commands, albeit without TLS.

SRGDamia1 commented 3 years ago

Sorry, sorry, sorry! I haven't gotten to look at this yet. I've put it on my calendar!

And, yes, there's an issue with Travis right now so all the checks will always fail.

magillus commented 3 years ago

Thank you for doing the work on this. I work with @gsvitak on this, and I am trying to use PubSubClient (ESP32, Arduino framework) with a SIM7000G. I use 1 as the mux, because using 0 ran into errors when configuring SSL; that is the reason for my comment above about sslversion. I am also trying to connect to AWS IoT, and below are the AT commands that I see. In the end I got +CAOPEN: 1,11, and error code 11 is hard to look up, because different versions of the 7000-series AT command manual show different things. It could be a timeout, because it sometimes takes a while to get that output. I am using the 1.06 command manual: http://www.mt-system.ru/sites/default/files/documents/sim7000_series_at_command_manual_v1.06.pdf#page=261

I wonder where I can find the latest manual for the 7000G?

AT output:

AT+CACLOSE=1

7m
AT+CACID=1

OK
AT+CSSLCFG="sslversion",1,3

OK
AT+CSSLCFG="ctxindex",1

+CSSLCFG: 1,3,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0,1,""

OK
AT+CASSLCFG=1,CACERT,"certificate.pem.crt"

OK
AT+CASSLCFG=1,ssl,1

OK
AT+CASSLCFG=1,protocol,0

OK
AT+CAOPEN=1,"our-test-url.us-west-2.amazonaws.com",8883
OVER-VOLTAGE WARNNING
OVER-VOLTAGE WARNNING
OVER-VOLTAGE WARNNING
OVER-VOLTAGE WARNNING

+CAOPEN: 1,11

magillus commented 3 years ago

@FStefanni I did additional tests, and a normal HTTP call failed to load with your PR code. The only difference is

   https://github.com/vshymanskyy/TinyGSM#master

vs

   https://github.com/FStefanni/TinyGSM#master

in the platformio.ini file (I use PlatformIO).

See attached files

FStefanni/TinyGSM#master: fail.connection.log.txt

original: vshymanskyy/TinyGSM#master success.connection.log.txt

I am happy to do more tests if needed, let me know.

FStefanni commented 3 years ago

Hi,

@magillus, I will try to summarize an answer:

1. I am not sure about your proposed change: AT+CSSLCFG="sslversion" wants a ctxindex, not a mux (aka "cid" in the manual). I am not sure that a mux/cid is the same as a ctxindex (and it should not be, since ctxindex is 0-5 and cid is 0-1 with SSL or 0-7 without SSL). From what I understand, it is just an index into a "memory" where some information is stored. So are you sure that the two things are the same? Can someone enlighten us, please?

2. I have a 1.06 manual too, and it seems to be lacking important things... I hope that a newer and better version exists...

3. I have no idea about the +CAOPEN: 1,11 error. From my tests I have seen that my SIM7000 sometimes has issues at startup, so I implemented a workaround:

  1. If any method call fails during startup, retry the failing method after a delay (1000-2000, depending on the failing method).
  2. Retry the same method a maximum of 3 times.
  3. If it is still failing after three retries, call the routine that resets the modem to factory defaults, and restart the board (in my case, an ESP32).

I know this is not optimal, but I have no idea why the same sequence sometimes fails... maybe the initialization also needs some refinement.

4. I know your bug: it is the same one I posted earlier in this discussion. For me, it happens only when I try to get a non-HTML file. Basically, it seems that when the server closes the connection, the modem flushes its memory, losing the last received data! I have not found a solution yet. What's funny (sigh!) is that it does not happen without the SSL-related commands, so I suspect it is a modem firmware bug. The issue is clear from your logs. Success log:

AT+CIPSTATUS=0 // Querying the status after the send

+CIPSTATUS: 0,0,"TCP","<MY_IP>","8082","REMOTE CLOSING" // Remote closed!

OK
AT+CIPRXGET=4,0 // Asking if there is any data

+CIPRXGET: 4,0,390 // Yes, there is!

OK

Failing log:

AT+CARECV? // Asking if there is any data...

+CADATAIND: 0 // Async message: there is data!

+CASTATE: 0,0 // Async message: connection closed
[40086] ### Closed:  0

+CARECV: 0,0 // The answer to the question: no data!
+CARECV: 1,0

I have not found a workaround. I hope someone could contribute. @marmotton has proposed a workaround, by avoiding the automatic setting of HTTP headers, but I have not tested it. @magillus, you could try it and let us know.
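
The startup workaround described in point 3 above could be sketched roughly as follows. This is not the code used in the project: runWithRetries, StartupResult and the callback parameters are hypothetical names, the delay is assumed to be in milliseconds, and the factory reset and board restart are left to the caller.

```cpp
#include <functional>

// Outcome of one startup step under the retry policy from point 3.
enum class StartupResult { Ok, OkAfterRetry, NeedsFactoryReset };

// step: the startup call to attempt; must return true on success.
// wait: pause between attempts (on Arduino this could wrap delay()).
// delayMs: pause length, assumed to be milliseconds.
// maxRetries: retries after the first failed attempt (3 in the comment above).
StartupResult runWithRetries(const std::function<bool()>& step,
                             const std::function<void(unsigned long)>& wait,
                             unsigned long delayMs,
                             int maxRetries = 3) {
    if (step()) return StartupResult::Ok;
    for (int retry = 0; retry < maxRetries; ++retry) {
        wait(delayMs);
        if (step()) return StartupResult::OkAfterRetry;
    }
    // Caller is expected to reset the modem to factory defaults and reboot.
    return StartupResult::NeedsFactoryReset;
}
```

Passing the wait function in as a parameter keeps the sketch independent of the Arduino core, so it can also be exercised on a desktop machine.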


Finally, I am not an expert in modems or AT commands, and I have no time to improve this code at the moment. I contributed this code in the hope that other people could also contribute to make it work properly, so any suggestion is more than welcome.

Regards

magillus commented 3 years ago

1. I am not sure about your proposed change: AT+CSSLCFG="sslversion" wants a ctxindex, not a mux (aka "cid" in the manual). I am not sure that a mux/cid is the same as a ctxindex (and it should not be, since ctxindex is 0-5 and cid is 0-1 with SSL or 0-7 without SSL). From what I understand, it is just an index into a "memory" where some information is stored. So are you sure that the two things are the same? Can someone enlighten us, please?

Thank you, it was confusing. Luckily for me, I had both set to 0 in my code.

I will continue investigating the difference in this PR for that http error.

Has anyone tried the AT+SMCONN family of commands for the connection? It was posted here as working: https://github.com/botletics/SIM7000-LTE-Shield/blob/master/SIM7000%20Documentation/AT%20Command%20Logs/SIM7000_AWS_Log.txt

Maybe I should focus on using SMCONN and related SMSSL commands for MQTT?

FStefanni commented 3 years ago

Hi,

I do not see that group of commands in the AT manual... can you please tell me where to find a reference?

Nevertheless, I am wondering if it is possible to mix the AT command sets, i.e.:

If it works, it could be the solution to our problems...

Regards

magillus commented 3 years ago

https://github.com/botletics/SIM7000-LTE-Shield/blob/master/SIM7000%20Documentation/AT%20Command%20Logs/SIM7000_AWS_Log.txt Here is a log with the AT commands. @gsvitak was also running the same commands on MicroPython, and they were working.

Here is an application note that I found for those commands: https://github.com/botletics/SIM7000-LTE-Shield/blob/master/SIM7000%20Documentation/Technical%20Documents/SIM7000%20Series_MQTT(S)_Application%20Note_V1.02.pdf

FStefanni commented 3 years ago

Hi,

OK, thank you. From what I see, it seems that the MQTT protocol is directly supported by the modem. This is different from what this library does: it is focused on the TCP (maybe UDP) connection with optional TLS. So unfortunately this does not help us fix the issues with this PR.

From what I have tested, MQTT(S) is working fine with this PR when using the PubSubClient library. Using AT commands for MQTT is a completely different approach.

Regards.

magillus commented 3 years ago

I tried MQTTS with your PR and it hangs on opening the connection (the CAOPEN AT command). I was trying to connect to AWS IoT with PubSubClient like you did. I will try again at the end of the week.

SRGDamia1 commented 3 years ago

I've spent a bunch of time working with this on my own SIM7000A and I cannot get it to work. The "application" commands (AT+CA..) consistently fail, whether I attempt a secure or insecure connection. I'm guessing it has something to do with the region/firmware/revision. The "TCP toolkit" (AT+CIP..) commands work for me without any hang-ups. I decided to add a level of hierarchy and split the SIM7000 into a SIM7000 as it was before and a SIM7000SSL with the "application" commands. @FStefanni could you please test out the current SIM7000SSL with your module to make sure I didn't get anything wrong in moving things?

Thank you!

FStefanni commented 3 years ago

Hi,

I am not using the sim7000 right now, but I'll test asap.

BTW, I think that the SIM7000 is not the most reliable modem: the AT commands are quite strange, and its behavior is not consistent. That is why I switched modems, and why I believe you did a good job splitting it into two.

Regards.

SRGDamia1 commented 3 years ago

I finally got this to work on my module. Apparently activating the GPRS context (AT+CGACT) causes the application activation (AT+CNACT) to fail. In moving things, I'd left that in. It makes no sense to me because I thought the application required an activated GPRS context, but I guess the module wants to do things differently. I really wish SIMCOM wouldn't decide to reinvent the wheel for every module.

I definitely hit the problem you mentioned as 4 in https://github.com/vshymanskyy/TinyGSM/pull/507#issuecomment-815505300 with the remote closing the connection and the data disappearing into the ether. It's got to be a modem firmware bug - the data is just gone once the remote closes. I couldn't find a way to query it fast enough. The only way I got it to work was by using keep-alive headers for HTTP(S). For anyone who doesn't absolutely need on-module SSL support, I'd recommend they use the TCP toolkit version rather than this application version so they don't hit this bug.

FStefanni commented 3 years ago

Hi,

I briefly tested the latest commit for the SSL version, but it does not work with MQTT. Basically, it seems unable to connect to the MQTT broker, since the connection seems to get closed after a short time.

@SRGDamia1 did you test MQTT? Does it work for you?

I have done a quick diff between what I submitted and the latest code, but they are quite different and it is not easy to spot the differences that actually matter... maybe it is the PDP context that makes the difference?

The AT log follows, with a few comments:

13:08:35.145 -> Gprs connected: AT+CGATT?
13:08:35.145 -> 
13:08:35.145 -> +CGATT: 1
13:08:35.145 -> 
13:08:35.145 -> OK
13:08:35.145 -> AT+CNACT?
13:08:35.179 -> 
13:08:35.179 -> +CNACT: 1,"10.33.112.78"
13:08:35.213 -> 
13:08:35.213 -> OK
13:08:35.213 -> true                                 // setup of connection: fine! :)
13:08:35.213 -> AT+CARECV?
13:08:38.214 -> AT+CASTATE?
13:08:41.241 -> Connecting to test.mosquitto.orgAT+CARECV? // connecting to a test broker
13:08:44.242 -> AT+CASTATE?
13:08:47.250 -> AT+CARECV?
13:08:50.278 -> AT+CASTATE?
13:08:53.277 -> AT+CACLOSE=0                                          // this seems to be the issue...
13:08:56.298 -> AT+CACID=0
13:10:11.314 ->  fail                                           // broker connection failed!
13:10:11.314 -> AT+CARECV?
13:10:14.314 -> AT+CASTATE?
13:10:17.342 -> AT+CEREG?
13:10:18.366 -> AT+CGREG?
13:10:19.354 -> Waiting for network...                  // restarting the connection, etc.
13:10:19.354 -> AT+CEREG?
13:10:20.374 -> AT+CGREG?
13:10:21.633 -> AT+CEREG?
13:10:22.656 -> AT+CGREG?
13:10:23.913 -> AT+CEREG?
13:10:24.903 -> AT+CGREG?
13:10:26.168 -> AT+CEREG?
13:10:27.203 -> AT+CGREG?
13:10:28.465 -> AT+CEREG?
13:10:29.453 -> AT+CGREG?

BTW, why is the modem reply no longer logged?

Regards

SRGDamia1 commented 3 years ago

I'm sorry; I hadn't tested MQTT. I think I've found the problem. The module really doesn't like being asked about the number of characters left in the buffer if no sockets are open. As in, stops responding until a power cycle doesn't like. I think I've got it fixed now.
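
The idea of not querying the receive buffer while no socket is open can be illustrated with a small guard. This is only a sketch of the principle, not the actual fix committed to the library: SocketTable and shouldPollReceiveBuffer are hypothetical names, and the table size of two muxes is an assumption taken from the TLS discussion earlier in this thread.

```cpp
#include <array>

// Hypothetical bookkeeping: one "is open" flag per mux.
struct SocketTable {
    std::array<bool, 2> open{};  // value-initialized: all sockets closed

    bool anyOpen() const {
        for (bool isOpen : open) {
            if (isOpen) return true;
        }
        return false;
    }
};

// Returns true only when at least one socket is open, i.e. when asking the
// modem for the number of buffered characters should be safe.
bool shouldPollReceiveBuffer(const SocketTable& sockets) {
    return sockets.anyOpen();
}
```

A maintenance loop would call shouldPollReceiveBuffer() before sending the buffer query and skip the query entirely when it returns false.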

FStefanni commented 3 years ago

Hi,

I confirm that now MQTT works again.

Regards.

janoacosta commented 1 year ago

Hi, I'm using a SIM7000G to connect with SSL, but I can't read the response:

AT+CNACT=1

OK

+APP PDP: ACTIVE
AT+CNACT?

+CNACT: 1,"10.32.131.231"

OK
AT+CACID=1

OK
AT+CSSLCFG="ignorertctime",1,1

OK
AT+CSSLCFG="sslversion",1,3

OK
AT+CASSLCFG=1,"ssl",1

OK
AT+CASSLCFG=1,"crindex",1

OK
AT+CASSLCFG=1,"protocol",0

OK
AT+CSSLCFG="ctxindex",1

+CSSLCFG: 1,3,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,0x0000,1,1,""

OK
AT+CAOPEN=1,"httpbin.org",443

+CAOPEN: 1,0

OK

AT+CARECV?

OK
AT+CASEND=1,44,10000

GET /get HTTP/1.1

Connection: keep-alive

OK

+CADATAIND: 1

+CASTATE: 1,0

+CACLOSE: 1

FStefanni commented 1 year ago

Hi,

The SIM7000 seems to be very problematic with TLS, and it does not work properly in all cases. Unfortunately I have no idea how to improve it (nor do I have the time, since it was for an old job). This repo is full of issues about this: try replying to some of them, maybe someone is able to fix it.

Regards