Closed jamesschlader closed 4 years ago
Here's a rough draft of an implementation that I think would be a lot easier for people to read and understand:
const subsections = []
walkSection(lines, sections, "ACCOUNT SUMMARY", (line, index) => {
  const words = line.split(/\s/)
  // A subsection heading starts with a non-empty, all-uppercase word
  const isSubsection = words[0] !== '' && words[0].toUpperCase() === words[0];
  if (isSubsection) {
    subsections.push({name: line, lineNumber: index})
  }
})
const balances = subsections.map((subsection, arrayIndex) => {
  // The last subsection has no successor, so fall back to the end of `lines`
  // (to-do: stop at the end of the ACCOUNT SUMMARY section instead)
  const nextSubsection = subsections[arrayIndex + 1]
  const endIndex = nextSubsection ? nextSubsection.lineNumber : lines.length
  let amountDue, amountPaid, costType
  for (let i = subsection.lineNumber; i < endIndex; i++) {
    const line = lines[i];
    const words = line.split(/\s/);
    if (line.includes('Amount Due')) {
      amountDue = words[words.length - 1];
    } else if (line.includes('Amount Paid') || line.includes('Paid In')) {
      amountPaid = words[words.length - 1];
    }
    // Keep the last cost type seen within this subsection
    costType = getCostType(line) || costType;
  }
  return {
    name: subsection.name,
    amountDue,
    amountPaid,
    costType,
  }
})
return balances

function getCostType(line) {
  if (line.includes('Restitution')) {
    return 'restitution'
  }
}
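To make the draft above easier to try out, here is a self-contained sketch that runs under Node. Several things in it are assumptions rather than anything from this PR: the `walkSection` stand-in (the real helper's signature and behavior may differ), the wrapper name `parseBalances`, and the sample lines, which are invented for illustration and not taken from a real docket. It also tracks where the section ends inside the callback, which is one possible way to handle the last subsection.

```javascript
// Hypothetical stand-in for the PR's walkSection helper: visits every line
// after the heading `name` until the next heading listed in `sections`.
function walkSection(lines, sections, name, visit) {
  const start = lines.indexOf(name)
  if (start === -1) return
  for (let i = start + 1; i < lines.length; i++) {
    if (sections.includes(lines[i])) break
    visit(lines[i], i)
  }
}

function getCostType(line) {
  if (line.includes('Restitution')) {
    return 'restitution'
  }
}

// Hypothetical wrapper around the draft logic so it can be called directly.
function parseBalances(lines, sections) {
  const subsections = []
  let sectionEnd = 0 // one past the last line visited in the section
  walkSection(lines, sections, 'ACCOUNT SUMMARY', (line, index) => {
    sectionEnd = index + 1
    const firstWord = line.trim().split(/\s+/)[0]
    // A subsection heading starts with a non-empty, all-uppercase word
    if (firstWord !== '' && firstWord.toUpperCase() === firstWord) {
      subsections.push({ name: line, lineNumber: index })
    }
  })
  return subsections.map((subsection, arrayIndex) => {
    // The last subsection runs to the end of the section
    const next = subsections[arrayIndex + 1]
    const endIndex = next ? next.lineNumber : sectionEnd
    let amountDue, amountPaid, costType
    for (let i = subsection.lineNumber; i < endIndex; i++) {
      const line = lines[i]
      const words = line.trim().split(/\s+/)
      if (line.includes('Amount Due')) {
        amountDue = words[words.length - 1]
      } else if (line.includes('Amount Paid') || line.includes('Paid In')) {
        amountPaid = words[words.length - 1]
      }
      costType = getCostType(line) || costType
    }
    return { name: subsection.name, amountDue, amountPaid, costType }
  })
}

// Invented sample input, for illustration only
const sampleLines = [
  'ACCOUNT SUMMARY',
  'COSTS AND FEES',
  'Amount Due $100.00',
  'Amount Paid $40.00',
  'RESTITUTION ACCOUNT',
  'Restitution owed to victim',
  'Amount Due $250.00',
  'Paid In $0.00',
  'PARTIES',
  'John Doe',
]
const balances = parseBalances(sampleLines, ['ACCOUNT SUMMARY', 'PARTIES'])
console.log(balances)
```

With this sample, the second (and last) subsection is parsed without an out-of-bounds lookup, and only it picks up the `restitution` cost type because `getCostType` matches case-sensitively.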
New commits based on feedback.
I have resolved all of the GitHub comments that have been addressed. Three comments remain, and the tests are failing. Once those comments are addressed and the build passes, I'll approve and merge.
This offers an opinionated method for parsing the Account Summary section of a Docket PDF. The parseAccountSummary function returns an object that summarizes all of the accounts, with their associated values, for this Docket. All the comments and rubbish are removed during parsing.
Rudimentary tests are also included.