Open OracPrime opened 3 days ago
Please note that we don't really need to be careful/sanitize parameters, because you should not use node-pg-migrate at runtime, but only for migration progresses.
That aside, what would you propose as an API?
And how does a current use-case look like right now (rough example).
The current use case if for populating a static lookup table. I'd agree in theory you're only using your own data, so you should feel slightly safe, but in an open source environment an innocuous update of a data file could become an injection attack, so I want to be paranoid.
I think my proposed api is to have a function that looks exactly like sql but which doesn't mustache.
Current use case it this (previously insert with $1... $5)
async up(pgm) {
pgm.createTable("uk_stations", {
station_id: { notNull: true, primaryKey: true, type: "bigserial" },
station_name: { notNull: true, type: "text" },
sixteen_character_name: { notNull: true, type: "text" },
longitude: { notNull: true, type: "float8" },
latitude: { notNull: true, type: "float8" },
crs_code: { notNull: true, type: "char(3)" },
});
const filePath = path.join(__dirname, "../ukStations.csv");
const fileContent = fs.readFileSync(filePath, "utf8");
const records = parse(fileContent, {
columns: true,
skip_empty_lines: true,
});
// Prepare data for bulk insertion
const values = records.map(
({
station_name,
sixteen_character_name,
latitude,
longitude,
crs_code,
}) => [
station_name,
sixteen_character_name,
latitude,
longitude,
crs_code,
],
);
const query = format(
`INSERT INTO uk_stations (station_name, sixteen_character_name, latitude, longitude, crs_code) VALUES %L`,
values,
);
pgm.sql(query);
},
Just me thinking out loud: Not sure if we should provide a insert
(and then update and delete) statement, because these are not really anymore migrations but populations 🤔
Currently I would suggest to use node-pg-migration as a migration tool to have your DB versioned and use other tools (potentially knex) for population after the migration was executed.
However, I even know from myself in company work that there is best practice and real world scenario 🤣 ...
So maybe we could add just the absolutely simplest form of such functions that in the end will more or less just to what you already do yourself with the format
function.
pgm.insert('uk_station', {
station_name: '...',
// data for one station...
});
pgm.insertBulk('uk_station', [
{
station_name: '...',
// data for one station...
},
// ...
]);
I have not thought that fully out yet, but I would be okay with it if you explore an initial draft PR on your own with that info and then I can review and see where it is going.
Sounds good?
Description
It is currently quite messy to use parameterised inserts. If you naively use pgm.sql with $1 style parameters you get an error message related to regular expressions which is somewhat impenetrable if your not expecting mustache templating. If you use mustache templating, you are open to injection attacks. The only solution I can find is using pg-format for build the string (and I'm a bit worried if the result of that itself contains, for example $1.
Suggested solution
It would be nice to have access to an equivalent to pgm.sql() which doesn't do the mustaching. psm.sql_raw() for example. Either a new function, or an additional options argument on the existing one. However I prefer the separate function approach as it keeps the meaning simple and the option argument might be confused with an argument for the sql.
Alternative
pg-format is the only one I'm aware of.
Additional context
At minimum the regex error should be decorated with some explanation?