veda-consulting-company / uk.co.vedaconsulting.mailchimp

Other
22 stars 43 forks source link

Code to order Mailchimp Groups may not be working as expected? #240

Closed xurizaemon closed 7 years ago

xurizaemon commented 7 years ago

I spent a bit of time (#238) around CRM_Mailchimp_Sync::splitMailchimpWebhookGroupsToCiviGroupIds() and noticed there's code in that function which aims to ensure Mailchimp Groups matched are ordered with longest interest name first ...

It's not clear to me why this is/was important but removing it might reintroduce some problems if group IDs aren't sorted in order of Mailchimp group name length 💥 ... was it to avoid matching an interest group that is a substring of another group?

Anyway - if that ordering is significant, it too seems to not work as the comments describe. Here's an example:

$map = ['abcdef' => 6, 'abc' => 3];
uksort($map, function ($a, $b) {
  return strlen($a) - strlen($b);
});
// => ['abc' => 3, 'abcdef' => 6]

To sort by length of interest name, this should be right?

uksort($map, function ($a, $b) {
  return strlen($a) < strlen($b);
});

But for me, it wasn't necessary to address this when identifying a fix for #238, so I'll leave that for feedback from developers.

xurizaemon commented 7 years ago

🎯 I found it, I think!

When I wrote the above, I thought I was looking at Mailchimp Groupings (ie the labels as shown in Mailchimp). However this block of code looks like it's referring to CiviCRM Groups (& accessing them via CiviCRM API, in order to get both Regular and Smart Groups).

Again, what I'm seeing doesn't gel with what's documented. In CiviCRM 4.7.11, I see this -

$result = civicrm_api3('Contact', 'get', array(
  'sequential' => 1,
  'return' => array("group"),
  'id' => 990,
));

// result:
{
  "is_error": 0,
  "version": 3,
  "count": 1,
  "id": 990,
  "values": [
    {
      "contact_id": "990",
      "groups": ",134,135",
      "id": "990"
    }
  ]
}

But the comments here suggest the author was seeing CiviCRM return a list of group names, and working around them possibly containing commas by scanning for matches longest first.

What I see returned is a list of CiviCRM Group IDs, which means the treatment is redundant, and possibly won't match ANY groups.

Just keeping notes here. Enjoy the show folks. (Uh, CiviCRM, what's with the prepended comma?)

xurizaemon commented 7 years ago

OK ... well, I got this to behave a lot better with the following.

public static function splitGroupTitles($group_titles, $group_details) {
  if (preg_match('/^[0-9,]+$/', $group_titles)) {
    return array_filter(explode(',', $group_titles));
  }
}

But I'm not going to submit a PR with that until I see it pass some tests, so ... just making notes, then 😴 💤

Also CiviCRM, what's with 'return' => 'group' then returning a value named group*s*?

xurizaemon commented 7 years ago

From #233.

CiviCRM's API changed its response to Contact:getsingle with return=group: it used to return a comma separated list of group titles, it now returns group ids!

Grrr. In order to fix this I'll have to have a way to know which exact version of CiviCRM changed this behaviour, and then check for that in the code and handle it two different ways depending on that.

The code in the previous comment handles this to some degree ... if it sees a string matching integers and commas, it treats it as group IDs, without caring what CiviCRM version we're on. This should work regardless of what version that behaviour changed - excepting cases where people have made CiviCRM Groups named with only commas and numbers I guess.

I asked about this behaviour change in CiviCRM chat a few days ago.

xurizaemon commented 7 years ago

Also - I'm not convinced the above does the full filter required of this function - so the array_filter() might not be the whole deal here.

I don't see any coverage of the value returned from 'groups' in CiviCRM's API tests, which means this might have landed without anyone knowing ...

artfulrobot commented 7 years ago

Oh, and on your first comment, I think your demo code is not equivalent. You have $a - $b whereas my code is the other way around. I believe my sorting code works, and returning three options (-ve, 0 or +ve) instead of a boolean ought to result in a faster sort (although of course realistically we're talking about only sorting a very few values!)

xurizaemon commented 7 years ago

No stress - TBH I was trying to make sense of why it happened, and I think I've ended up looking at both Mailchimp and CiviCRM group lists here. So code snippets here are just for discussion.

Anyway I found some more info on this unexpected change in behaviour:

Note that code to parse group lists won't match for Mailchimp and CiviCRM, so ... With three groups named 'Foo', 'Bar', and 'Ga, zonk', we will see:

artfulrobot commented 7 years ago

Splitting old-style CiviCRM group name strings: did you see the unit test - should be fine.

Talking of data from Mailchimp, this happens from the webhook and as an API response.

The MC API response is handled at https://github.com/artfulrobot/uk.co.vedaconsulting.mailchimp/blob/master/CRM/Mailchimp/Sync.php#L891 and the MC API does not concat group titles here but instead returns nice structured data.

The Webhook requests are handled in splitMailchimpWebhookGroupsToCiviGroupIds(). I've reworked that code and moved it into the Utils class see splitGroupTitlesFromMailchimp() and added a test testGroupTitleSplittingFromMailchimp()

Look good?

deepak-srivastava commented 7 years ago

The work seems to have been merged and v2.x is out.