Closed by Meiqingx 11 months ago
Due to the limitations listed above, we have kept the same data structure of two separate tables: one table with the text fields and another with the non-text values (the two listed above). When we use them for internal data production (especially running classifiers), I simply merge the two tables into a single upstream table for whichever classifier takes a version of these variables as input.
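For anyone who wants to reproduce that merge step locally, here is a minimal sketch using pandas. The key column `ad_id`, the columns `ad_text` and `spend`, and the helper `build_classifier_input` are hypothetical names chosen for illustration, not the actual field names in our tables; substitute whatever identifier the two tables share.

```python
import pandas as pd


def build_classifier_input(text_df, var_df, key="ad_id"):
    """Join the text table and the variable table on a shared key.

    `key` is an assumed column name; replace it with the real identifier.
    validate="one_to_one" raises a MergeError if the key is duplicated in
    either table, so a bad join fails loudly instead of silently adding rows.
    """
    return text_df.merge(var_df, on=key, how="inner", validate="one_to_one")


# Toy stand-ins for the two tables (made-up values, for illustration only).
text_df = pd.DataFrame(
    {"ad_id": ["a1", "a2", "a3"], "ad_text": ["vote early", "lower taxes", "join us"]}
)
var_df = pd.DataFrame({"ad_id": ["a1", "a2", "a4"], "spend": [100, 250, 75]})

# Only rows present in both tables survive the inner join.
merged = build_classifier_input(text_df, var_df)
```

An inner join is used here so a classifier only sees rows that have both text and variables; switching to `how="left"` would instead keep every text row and leave missing variables as NaN.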
The current separation of the text table and the variable table is based on three considerations: 1) potential user needs (researchers focus on different aspects of political advertising), 2) data size limitations on GitHub, and 3) incompatible encoding of text variables across operating systems.
That said, we are exploring ways to better accommodate internal data production.