formtools / module-data_visualization

The Data Visualization module.
https://docs.formtools.org/modules/data_visualization/
7 stars 9 forks source link

Possible Inaccurate Date Range #21

Closed NateSchwarz closed 5 years ago

NateSchwarz commented 5 years ago

Hi Ben,

I'm having an issue with the date range accurately pulling in data. I'm using the latest version of form tools and data vis with the new date range abilities on field charts we messaged about a month or so ago.

On the field charts, the date range (specifically year-to-date and previous 30 days) seem to be doing something in terms of pulling data, but I'm noticing some discrepancies. The year-to-date seems to be working pulling in an amount of submissions that makes sense with all my filtering. However, the previous 30 day range is pulling in a very similar amount of submissions. To clarify, the two definitely shouldn't be pulling anywhere close to the same number of submissions.

If this makes sense, there are about 4000 submissions over the past two-ish years. (Just as an example) I'm filtering how many people chose the color blue as an option. I'm using a view which parses out all submissions with the option "blue" then I'm then trying to use the YTD and previous 30 days filters to see how many people selected that option over those two time intervals.

I don't know if I'm missing something, or I'm not setting up the visualization correctly, but it looks like the time filter is doing something because the numbers do change slightly (a few submissions here and there) between the two time filters, just not in anyway that makes sense or is accurate.

I dug around the code and was able to figure out (at least I think) that the filters are checking the submission date to do their filtering. So I'm wondering if there is bug somewhere?

I could be missing something blatantly obvious and apologize if I am. I'm just wondering what could be causing inaccuracy in the filtering.

benkeen commented 5 years ago

Hey @NateSchwarz, thanks for the heads up. Would you mind taking a screenshot of your visualization configuration (or just jotting down all the settings) and posting them here? From what you've described it certainly sounds like a bug - but that'll help me locate it better. I've created a ticket on the Data Visualization module to track it, but maybe respond on this ticket.

Thanks!

NateSchwarz commented 5 years ago

No Problem,

Chart Type: Field Chart - Bar Cache: Do Not Cache Date Range to Display: Last 30 Days Ignore Empty Fields: Yes

Access: Public Where Shown: All Views

For the view and field, I have a specific view that only pulls in one option from a drop down and then for the field I have a star rating 1-5 (check boxes)

If theres anything else you need just let me know, thanks for looking into it!

benkeen commented 5 years ago

Sorry I haven't gotten to this yet, Nate. I'd planned on doing it last weekend but wanted to get the new File Upload module released, which is taking much longer than I thought to finish off. Pretty close now though & I'll look at this when I'm done.

NateSchwarz commented 5 years ago

No worries, thanks for the update!

benkeen commented 5 years ago

Brilliant, thanks for this Nate. Yes, it's a bug, I've just been able to reproduce it.

I'm going to try a new trick with this issue & move it to the Data Visualization module (new feature that github offers!). But please do continue reporting everything in Core - it seems like a good way to group everything.

benkeen commented 5 years ago

Hey @NateSchwarz, it looks like the query isn't factoring in whether the submission is finalized or not, which is clearly incorrect. Is this an API form where you're seeing this problem? That would explain the discrepancies you're seeing.

Let me know and I'll wait to update the module with the fix, in case it's something else.

NateSchwarz commented 5 years ago

Hi Ben,

No I'm not running any APIs for it. It's just a form builder form run on page using an iframe so nothing special.

benkeen commented 5 years ago

Thanks Nate. Would you mind try running this query on your database? Just want to know if it returns the correct current time. The Field Charts queries return results relative to that info, so wanted to rule that out.

SELECT NOW();
NateSchwarz commented 5 years ago

Yep, returned the correct current time and date

benkeen commented 5 years ago

Hmph! I gotta say, nothing is standing out as incorrect besides the is_finalized thing mentioned above.

benkeen commented 5 years ago

Let's do a little brain surgery and get it to output to the logs the exact SQL query it's using.

On line 165 of modules/data-visualization/code/FieldCharts.class.php you'll see this chunk:

if ($ignore_empty_fields == "yes") {
    $db->query("
        SELECT $col_name as field_value, count(*) as count
        FROM   {PREFIX}form_{$form_id}
        WHERE  1=1
            AND $col_name IS NOT NULL AND TRIM($col_name) != ''
               $filter_where_clause
               $date_range_clause
        GROUP BY field_value
    ");
} else {

Change that to:

if ($ignore_empty_fields == "yes") {
    $query = "
        SELECT $col_name as field_value, count(*) as count
        FROM   {PREFIX}form_{$form_id}
        WHERE  1=1
            AND $col_name IS NOT NULL AND TRIM($col_name) != ''
               $filter_where_clause
               $date_range_clause
        GROUP BY field_value
   ";
   $db->query($query);
   error_log($query);
} else {

That'll now log the exact query that it's using to your PHP file. If it's not sensitive, mind posting it here? Anything suspicious in the query - wrong dates being applied, or extra fields being checked against?

NateSchwarz commented 5 years ago

Hey Ben,

I gave that a try, where exactly am I looking for the logging? I assumed the error log but nothing was updated.

benkeen commented 5 years ago

Yeah, the PHP logs should be where you see it. Strange you don't... the location you edited was for Field Charts with ignore empty fields = "yes" and would execute when it's not cached, which seems correct for your form.

But actually, I think it might be moot. I realized that other fix I made to this repo may also fix your scenario. Form Builder forms also use the is_finalized flag. Basically that DB field is used to mark a submission in completed or not, so if a user doesn't finish putting through a form, that incomplete info never shows up in Form Tools. As such, the Field Chart may well be showing you invalid data because of that.

Try this: change that one line in the SQL statement above from WHERE 1=1 to WHERE is_finalized = 'yes'. Also make the same change a few lines lower in the next DB query.

Let me know how your chart looks after that change.

NateSchwarz commented 5 years ago

I tried that and the charts aren't looking any different or outputting any different data unfortunately.

NateSchwarz commented 5 years ago

Hey Ben,

I just wanted to let you know, I found an anomaly with the dates in my data after I did the last import a few nights ago. There are about 5000 records I had to transition from an old database to the new one I'm working with, and a chunk of the dates somehow got corrupted between excel and phpmyadmin. I caught it the other night when I noticed I had entries from 2031 which led me to believe as I kept exporting and importing the problem compounded on itself, and I didn't realise until it was so severe.

I'm still working with the dates to make sure they are correct, but the problem may have not been in the code after all.

benkeen commented 5 years ago

Ahh, thanks very much for letting me know @NateSchwarz. I was quite out of ideas.

I will release a new version of the module tonight though, to patch that one issue I found.

benkeen commented 5 years ago

Feel free to re-open if you find it's still an issue.