chriltola / mail

Epolicy Index #39

Open chriltola opened 1 month ago

chriltola commented 1 month ago
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

std::vector<std::vector<std::string>> extractDataBetweenLines(const std::string& filePath) {
    std::ifstream inputFile(filePath);
    std::vector<std::vector<std::string>> allCapturedData;
    std::vector<std::string> currentData;
    std::string line;
    bool capture = false;

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return allCapturedData;
    }

    while (std::getline(inputFile, line)) {
        if (line == "</DOCUMENTS>") {
            if (capture) {
                capture = false;
            }
        }
        else if (line == "<DOCUMENTS>") {
            capture = true;
        }
    }

    inputFile.close();
    return allCapturedData;
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    std::vector<std::vector<std::string>> allData = extractDataBetweenLines(filePath);

    std::cout << "Captured Data:" << std::endl;
    for (const auto& section : allData) {
        for (const auto& line : section) {
            std::cout << line << std::endl;
        }
        std::cout << "------------------------" << std::endl;
    }

    return 0;
}

Write C++ code to capture the data between <DOCUMENTS> and </DOCUMENTS>.

chriltola commented 1 month ago

To capture data between <DOCUMENTS> and </DOCUMENTS> tags and store them in a nested vector, you need to adjust your code to correctly handle capturing the lines and storing them. Here's the modified version of your code:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>

std::vector<std::vector<std::string>> extractDataBetweenLines(const std::string& filePath) {
    std::ifstream inputFile(filePath);
    std::vector<std::vector<std::string>> allCapturedData;
    std::vector<std::string> currentData;
    std::string line;
    bool capture = false;

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return allCapturedData;
    }

    while (std::getline(inputFile, line)) {
        if (line == "</DOCUMENTS>") {
            if (capture) {
                allCapturedData.push_back(currentData); // Save the current captured data
                currentData.clear(); // Clear current data for the next section
                capture = false;
            }
        } else if (line == "<DOCUMENTS>") {
            capture = true;
        } else if (capture) {
            currentData.push_back(line); // Collect data lines
        }
    }

    inputFile.close();
    return allCapturedData;
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    std::vector<std::vector<std::string>> allData = extractDataBetweenLines(filePath);

    std::cout << "Captured Data:" << std::endl;
    for (const auto& section : allData) {
        for (const auto& line : section) {
            std::cout << line << std::endl;
        }
        std::cout << "------------------------" << std::endl;
    }

    return 0;
}

Explanation:

  1. File Opening Check: It checks if the file can be opened successfully.
  2. Reading Lines: Reads the file line by line.
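  3. Tag Detection: Uses the boolean flag capture to start and stop capturing lines between <DOCUMENTS> and </DOCUMENTS>. The comparison is an exact match on the whole line, so a tag line with leading or trailing whitespace (or a trailing carriage return) is not recognized.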
  4. Data Collection: Captures lines between the tags and stores them in a nested vector.
  5. Output: Prints the captured data sections.

This captures every section that sits between a <DOCUMENTS> line and the following </DOCUMENTS> line, however many such sections the file contains, and prints each one.
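
The same flag-based capture can be sketched against a generic input stream, which makes it easy to try on in-memory text before pointing it at a real file. This is only a sketch: captureSections() and captureSectionsFromText() are hypothetical names that do not appear in the code above, and the exact-match comparison is kept as is.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of extractDataBetweenLines() that reads from any stream.
std::vector<std::vector<std::string>> captureSections(std::istream& in) {
    std::vector<std::vector<std::string>> sections;
    std::vector<std::string> current;
    std::string line;
    bool capture = false;

    while (std::getline(in, line)) {
        if (line == "<DOCUMENTS>") {
            capture = true;      // opening tag: start a fresh section
            current.clear();
        } else if (line == "</DOCUMENTS>") {
            if (capture) {       // closing tag: store the finished section
                sections.push_back(current);
                capture = false;
            }
        } else if (capture) {
            current.push_back(line);  // ordinary line inside a section
        }
    }
    return sections;
}

// Convenience wrapper for trying the function on a string.
std::vector<std::vector<std::string>> captureSectionsFromText(const std::string& text) {
    std::istringstream in(text);
    return captureSections(in);
}
```

Calling captureSectionsFromText() on text with two tagged blocks should return two inner vectors, one per block, with the lines outside the tags dropped.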

chriltola commented 1 month ago

This flow is a bit off. Let's use the following flow instead:

1. first, line.find <EMAIL.ADDRESS> and in that line get data between <EMAIL.ADDRESS> and </EMAIL.ADDRESS> as global 

2. second, loop read every between line <DOCUMENT> and </DOCUMENT>:
   a. line.find <PDF.FILENAME.PREFIX> and in that line get data between <PDF.FILENAME.PREFIX> and </PDF.FILENAME.PREFIX> 
   b. line.find <NAME> and in that line get data between <NAME> and </NAME>
   c. line.find "Company/Customer's Name :" and in that line get data after "Date        :"
   d. line.find "Invoice No. :" and in that line get data after "Invoice No. :"
   e. line.find <RECORDNUMBER> and in that line get data between <RECORDNUMBER> and </RECORDNUMBER>
   f. line.find <TITLE> and in that line get data between <TITLE> and </TITLE>
   g. write result to CSV file with following format: PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE
chriltola commented 1 month ago

To achieve this, we'll need to modify the code to handle the specific parsing requirements. Here's how we can implement the flow you described:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Global variable to store the email address
std::string globalEmailAddress;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, companyName, invoiceNo, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        } else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        } else if (line.find("Company/Customer's Name :") != std::string::npos) {
            companyName = line.substr(line.find("Company/Customer's Name :") + std::string("Company/Customer's Name :").length());
        } else if (line.find("Invoice No. :") != std::string::npos) {
            invoiceNo = line.substr(line.find("Invoice No. :") + std::string("Invoice No. :").length());
        } else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        } else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Write the captured data to the CSV file
    outputFile << pdfFileNamePrefix << ',' 
               << name << ','
               << globalEmailAddress << ',' 
               << companyName << ',' 
               << invoiceNo << ',' 
               << recordNumber << ',' 
               << title << '\n';
}

void extractDataAndWriteToCSV(const std::string& filePath, const std::string& outputCSVPath) {
    std::ifstream inputFile(filePath);
    std::ofstream outputFile(outputCSVPath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputCSVPath << std::endl;
        return;
    }

    outputFile << "PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE\n";

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
    outputFile.close();
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    const std::string outputCSVPath = "output.csv"; // Replace with desired output CSV file path

    extractDataAndWriteToCSV(filePath, outputCSVPath);

    std::cout << "Data extraction completed. Results written to " << outputCSVPath << std::endl;

    return 0;
}

Explanation:

  1. Global Email Address:

    • A global variable globalEmailAddress is used to store the email address found at the beginning of the document.
  2. Data Extraction:

    • The extractBetween() function extracts a substring between two specified delimiters.
    • The processDocumentSection() function is responsible for capturing data between <DOCUMENT> and </DOCUMENT> tags.
    • Each relevant line is checked, and the desired data is extracted.
  3. CSV Output:

    • The data is written to a CSV file in the specified format.
  4. Main Function:

    • The extractDataAndWriteToCSV() function manages the overall process: it reads the file, captures the email address, and processes each <DOCUMENT> section.
    • The results are saved to an output CSV file.

This code will read through the specified file, extract the required data, and write the results into a CSV file, following the structure you've requested.
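
One thing the code above does not handle is a field that itself contains a comma (for example a company name such as "ACME, Inc."), which would shift the columns in the CSV. If that can happen in your data, a small quoting helper may help. The sketch below assumes RFC 4180 style quoting is acceptable to whatever reads the file; csvEscape() is a hypothetical helper, not part of the code above.

```cpp
#include <string>

// Hypothetical helper: quote a field only when it contains a comma, a double
// quote, or a line break. Embedded double quotes are doubled.
std::string csvEscape(const std::string& field) {
    if (field.find_first_of(",\"\r\n") == std::string::npos) {
        return field;  // nothing special, write as is
    }
    std::string out = "\"";
    for (char c : field) {
        if (c == '"') {
            out += '"';  // escape a quote by doubling it
        }
        out += c;
    }
    out += '"';
    return out;
}
```

To use it, wrap each value when writing the row, for example outputFile << csvEscape(name) << ','.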

chriltola commented 1 month ago

i want to change:

c. line.find "Company/Customer's Name :" and in that line get data after "Date        :" as global
d. line.find "Invoice No. :" and in that line get data after "Invoice No. :" as global
chriltola commented 1 month ago

To modify the code as requested, we'll adjust the flow to store the data extracted from "Company/Customer's Name :" and "Invoice No. :" as global variables. Here's the updated version of the code:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        } else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        } else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        } else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Write the captured data to the CSV file
    outputFile << pdfFileNamePrefix << ',' 
               << name << ','
               << globalEmailAddress << ',' 
               << globalCompanyName << ',' 
               << globalInvoiceNo << ',' 
               << recordNumber << ',' 
               << title << '\n';
}

void extractDataAndWriteToCSV(const std::string& filePath, const std::string& outputCSVPath) {
    std::ifstream inputFile(filePath);
    std::ofstream outputFile(outputCSVPath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputCSVPath << std::endl;
        return;
    }

    outputFile << "PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE\n";

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Capture the company name (global)
        else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        }
        // Step 3: Capture the invoice number (global)
        else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        }
        // Step 4: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
    outputFile.close();
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    const std::string outputCSVPath = "output.csv"; // Replace with desired output CSV file path

    extractDataAndWriteToCSV(filePath, outputCSVPath);

    std::cout << "Data extraction completed. Results written to " << outputCSVPath << std::endl;

    return 0;
}

Changes and Explanation:

  1. Global Variables:

    • Added globalCompanyName and globalInvoiceNo as global variables to store the extracted company name and invoice number, respectively.
  2. Extraction Functions:

    • extractAfter() function is added to extract data after a specific delimiter (e.g., after "Company/Customer's Name :" or "Invoice No. :").
  3. Processing the Document Section:

    • Inside the processDocumentSection() function, only the local variables specific to each <DOCUMENT> tag are handled. The global variables for the email, company name, and invoice number are used directly when writing to the CSV file.
  4. Main Extraction Loop:

    • Captures the email address, company name, and invoice number outside the <DOCUMENT> tags and stores them globally.

This setup ensures that the company name and invoice number are captured globally and used correctly when processing each <DOCUMENT> section.
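
Note that extractAfter() returns everything after the delimiter unchanged, including any spaces that follow the colon and a trailing carriage return if the file has Windows line endings. If you want clean values, you could trim the result. A minimal sketch, assuming only spaces, tabs, and line-break characters need removing; trim() and extractAfterTrimmed() are hypothetical names:

```cpp
#include <string>

// Hypothetical helper: strip leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* whitespace = " \t\r\n";
    std::size_t first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return "";  // empty or whitespace only
    }
    std::size_t last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Same lookup as extractAfter(), with the result trimmed.
std::string extractAfterTrimmed(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) {
        return "";
    }
    return trim(line.substr(pos + delimiter.length()));
}
```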

chriltola commented 1 month ago

I understand now. You want to ensure that the company name and invoice number are captured within each <DOCUMENT> and </DOCUMENT> section. These should be treated similarly to the other fields, but as global variables for each document section. Here's how to update the code:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        } else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        } else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        } else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        } else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        } else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Write the captured data to the CSV file
    outputFile << pdfFileNamePrefix << ',' 
               << name << ','
               << globalEmailAddress << ',' 
               << globalCompanyName << ',' 
               << globalInvoiceNo << ',' 
               << recordNumber << ',' 
               << title << '\n';
}

void extractDataAndWriteToCSV(const std::string& filePath, const std::string& outputCSVPath) {
    std::ifstream inputFile(filePath);
    std::ofstream outputFile(outputCSVPath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputCSVPath << std::endl;
        return;
    }

    outputFile << "PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE\n";

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
    outputFile.close();
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    const std::string outputCSVPath = "output.csv"; // Replace with desired output CSV file path

    extractDataAndWriteToCSV(filePath, outputCSVPath);

    std::cout << "Data extraction completed. Results written to " << outputCSVPath << std::endl;

    return 0;
}

Explanation:

  1. Global Variables:

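    • globalCompanyName and globalInvoiceNo are still global, but they are now updated within each <DOCUMENT> section rather than outside of it. They are not reset between sections, so a section that lacks one of these lines reuses the value from the previous section.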
  2. Processing Logic:

    • Inside the processDocumentSection() function, the code now looks for Company/Customer's Name : and Invoice No. : within each <DOCUMENT> section. If found, it updates the global variables globalCompanyName and globalInvoiceNo accordingly.
  3. Main Extraction Loop:

    • The main loop still looks for <EMAIL.ADDRESS> before <DOCUMENT> sections and processes the document sections as before. Now, each document section will have its company name and invoice number captured and written to the CSV.

This ensures that the data for "Company/Customer's Name :" and "Invoice No. :" is captured correctly for each document section and associated with the other relevant fields.
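
If a value carried over from an earlier section would be a problem, one option is to keep the per-document fields in a local record instead of in globals, so each section starts empty. The sketch below shows the idea for the two fields discussed here only. DocumentRecord, afterDelimiter(), and parseDocument() are hypothetical names, and the stream is assumed to be positioned just after a <DOCUMENT> line.

```cpp
#include <istream>
#include <sstream>  // only needed for trying the function on in-memory text
#include <string>

// Hypothetical per-document record; every call to parseDocument() starts empty.
struct DocumentRecord {
    std::string companyName;
    std::string invoiceNo;
};

// Returns the text after the delimiter, or an empty string if it is absent.
static std::string afterDelimiter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) {
        return "";
    }
    return line.substr(pos + delimiter.length());
}

// Reads lines up to </DOCUMENT> and fills a fresh record.
DocumentRecord parseDocument(std::istream& in) {
    DocumentRecord record;
    std::string line;
    while (std::getline(in, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("Company/Customer's Name :") != std::string::npos) {
            record.companyName = afterDelimiter(line, "Company/Customer's Name :");
        } else if (line.find("Invoice No. :") != std::string::npos) {
            record.invoiceNo = afterDelimiter(line, "Invoice No. :");
        }
    }
    return record;
}
```

With this shape, a section without an "Invoice No. :" line yields an empty invoice number rather than the previous section's value.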

chriltola commented 1 month ago

Now everything is correct, but I want to give each output field a fixed length, as follows:

  1. pdfFileNamePrefix 60 char
  2. name 40 char
  3. globalEmailAddress 50 char
  4. globalCompanyName 20 char
  5. globalInvoiceNo 20 char
  6. recordNumber 20 char
  7. title 40 char
chriltola commented 1 month ago

To ensure that each output data field in the CSV has a specific length, you can pad or truncate the strings to the desired length. Here's how you can modify the code to achieve this:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iomanip>

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

// Function to format a string to a specific length
std::string formatString(const std::string& str, size_t length) {
    if (str.length() > length) {
        return str.substr(0, length);
    } else {
        return str + std::string(length - str.length(), ' ');
    }
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        } else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        } else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        } else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        } else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        } else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Format each string to the specified length
    pdfFileNamePrefix = formatString(pdfFileNamePrefix, 60);
    name = formatString(name, 40);
    globalEmailAddress = formatString(globalEmailAddress, 50);
    globalCompanyName = formatString(globalCompanyName, 20);
    globalInvoiceNo = formatString(globalInvoiceNo, 20);
    recordNumber = formatString(recordNumber, 20);
    title = formatString(title, 40);

    // Write the formatted data to the CSV file
    outputFile << pdfFileNamePrefix << ',' 
               << name << ',' 
               << globalEmailAddress << ',' 
               << globalCompanyName << ',' 
               << globalInvoiceNo << ',' 
               << recordNumber << ',' 
               << title << '\n';
}

void extractDataAndWriteToCSV(const std::string& filePath, const std::string& outputCSVPath) {
    std::ifstream inputFile(filePath);
    std::ofstream outputFile(outputCSVPath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputCSVPath << std::endl;
        return;
    }

    outputFile << "PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE\n";

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
    outputFile.close();
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    const std::string outputCSVPath = "output.csv"; // Replace with desired output CSV file path

    extractDataAndWriteToCSV(filePath, outputCSVPath);

    std::cout << "Data extraction completed. Results written to " << outputCSVPath << std::endl;

    return 0;
}

Explanation:

  1. formatString Function:

    • This function is responsible for ensuring that each string is formatted to the specified length.
    • If the string is longer than the specified length, it truncates the string.
    • If the string is shorter, it pads the string with spaces to the right until it reaches the desired length.
  2. String Formatting:

    • Before writing to the CSV file, each string (pdfFileNamePrefix, name, globalEmailAddress, globalCompanyName, globalInvoiceNo, recordNumber, title) is passed through the formatString function to ensure it meets the required length.
  3. Fixed Lengths:

    • pdfFileNamePrefix: 60 characters
    • name: 40 characters
    • globalEmailAddress: 50 characters
    • globalCompanyName: 20 characters
    • globalInvoiceNo: 20 characters
    • recordNumber: 20 characters
    • title: 40 characters

This modification ensures that each field in your output CSV file has a fixed length, either by truncating or padding with spaces as necessary.
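
The listing above includes <iomanip> without using it. As an alternative to building the padding by hand, the same result can be sketched with the stream manipulators std::left and std::setw. padField() is a hypothetical name. std::setw only pads and never truncates, so the value is cut with substr() first. Lengths here are counted in bytes, which matches characters only for single-byte text.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical alternative to formatString(): truncate to width, then
// left-align and pad with spaces using stream manipulators.
std::string padField(const std::string& value, std::size_t width) {
    std::ostringstream out;
    out << std::left << std::setw(static_cast<int>(width)) << value.substr(0, width);
    return out.str();
}
```

For example, padField(title, 40) should behave like formatString(title, 40) for plain single-byte text.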

chriltola commented 1 month ago

Starting from this code, I want to process every file in a specific folder (the folder path is keyed in by the user) and write all results into a single file named CLPEMINDEXDDMMYY.TXT, where DDMMYY is the date of the run, e.g. 090824.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iomanip>

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

// Function to format a string to a specific length
std::string formatString(const std::string& str, size_t length) {
    if (str.length() > length) {
        return str.substr(0, length);
    }
    else {
        return str + std::string(length - str.length(), ' ');
    }
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        }
        else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        }
        else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        }
        else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        }
        else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        }
        else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Format each string to the specified length
    pdfFileNamePrefix = formatString(pdfFileNamePrefix, 60);
    name = formatString(name, 40);
    globalEmailAddress = formatString(globalEmailAddress, 50);
    globalCompanyName = formatString(globalCompanyName, 20);
    globalInvoiceNo = formatString(globalInvoiceNo, 20);
    recordNumber = formatString(recordNumber, 20);
    title = formatString(title, 40);

    // Write the formatted fields as one fixed-width record (no separators)
    outputFile << pdfFileNamePrefix
        << name
        << globalEmailAddress
        << globalCompanyName
        << globalInvoiceNo
        << recordNumber
        << title << '\n';
}

void extractDataAndWriteToCSV(const std::string& filePath, const std::string& outputCSVPath) {
    std::ifstream inputFile(filePath);
    std::ofstream outputFile(outputCSVPath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputCSVPath << std::endl;
        return;
    }

    //outputFile << "PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE\n"; //Header

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
    outputFile.close();
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    const std::string outputCSVPath = "output.csv"; // Replace with desired output CSV file path

    extractDataAndWriteToCSV(filePath, outputCSVPath);

    std::cout << "Data extraction completed. Results written to " << outputCSVPath << std::endl;

    return 0;
}
chriltola commented 1 month ago

To modify your code to loop through all files in a specific folder and output the result into a single file named CLPEMINDEXDDMMYY.TXT, where DDMMYY is the current date, you'll need to:

  1. Loop through all files in a given directory.
  2. Open each file, process it, and append the result to the output file.
  3. Format the output file name with the current date.

Here is your updated code:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iomanip>
#include <filesystem>
#include <ctime>

namespace fs = std::filesystem;

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

// Function to format a string to a specific length
std::string formatString(const std::string& str, size_t length) {
    if (str.length() > length) {
        return str.substr(0, length);
    }
    else {
        return str + std::string(length - str.length(), ' ');
    }
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        }
        else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        }
        else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        }
        else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        }
        else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        }
        else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Format each string to the specified length
    pdfFileNamePrefix = formatString(pdfFileNamePrefix, 60);
    name = formatString(name, 40);
    globalEmailAddress = formatString(globalEmailAddress, 50);
    globalCompanyName = formatString(globalCompanyName, 20);
    globalInvoiceNo = formatString(globalInvoiceNo, 20);
    recordNumber = formatString(recordNumber, 20);
    title = formatString(title, 40);

    // Write the formatted data to the output file
    outputFile << pdfFileNamePrefix
        << name
        << globalEmailAddress
        << globalCompanyName
        << globalInvoiceNo
        << recordNumber
        << title << '\n';
}

void extractDataAndWriteToOutput(const std::string& filePath, std::ofstream& outputFile) {
    std::ifstream inputFile(filePath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
}

int main() {
    // Get the current date
    std::time_t t = std::time(nullptr);
    std::tm* now = std::localtime(&t);
    std::ostringstream oss;
    oss << std::setw(2) << std::setfill('0') << now->tm_mday
        << std::setw(2) << std::setfill('0') << (now->tm_mon + 1)
        << std::to_string(now->tm_year + 1900).substr(2);

    std::string outputFileName = "CLPEMINDEX" + oss.str() + ".TXT";

    // Open the output file
    std::ofstream outputFile(outputFileName);
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputFileName << std::endl;
        return 1;
    }

    std::string directoryPath = "your_directory_path"; // Replace with your directory path

    // Loop through all files in the directory
    for (const auto& entry : fs::directory_iterator(directoryPath)) {
        if (entry.is_regular_file()) {
            extractDataAndWriteToOutput(entry.path().string(), outputFile);
        }
    }

    outputFile.close();

    std::cout << "Data extraction completed. Results written to " << outputFileName << std::endl;

    return 0;
}

Explanation:

  1. <filesystem> library: This library is used to iterate through all the files in a given directory.

  2. std::time_t and std::tm: These are used to get the current date and format it as DDMMYY.

  3. Output file name: The output file name is generated dynamically based on the current date.

  4. Directory iteration: The code iterates over all regular files in the specified directory and processes each file, appending the output to the CLPEMINDEXDDMMYY.TXT file.

Make sure to replace "your_directory_path" with the actual path to the directory containing your files.
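
One thing the loop above leaves open: directory_iterator visits entries in an unspecified order, and if the output file sits in the same folder as the inputs, a file left over from an earlier run would be read as input. Here is a minimal sketch of one way to handle both, assuming C++17. The names listInputFiles and processFolder are hypothetical, and writing the output into the input directory is an assumption, not something the original code does.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the original code): returns the regular
// files in `directory`, sorted by path, leaving out the file `outputName`.
std::vector<fs::path> listInputFiles(const fs::path& directory,
                                     const std::string& outputName) {
    std::vector<fs::path> files;
    for (const auto& entry : fs::directory_iterator(directory)) {
        if (!entry.is_regular_file()) continue;  // skip subdirectories etc.
        if (entry.path().filename().string() == outputName) continue;
        files.push_back(entry.path());
    }
    // directory_iterator visits entries in an unspecified order, so sort
    // to keep the order of records in the index file reproducible.
    std::sort(files.begin(), files.end());
    return files;
}

// Collects the inputs first, then opens the output and hands each input to
// `processOne`, which stands in for extractDataAndWriteToOutput.
// Assumption: the output file is written into the same directory.
template <typename ProcessOne>
bool processFolder(const fs::path& directory, const std::string& outputName,
                   ProcessOne processOne) {
    const std::vector<fs::path> inputs = listInputFiles(directory, outputName);
    std::ofstream outputFile(directory / outputName);
    if (!outputFile.is_open()) return false;
    for (const fs::path& input : inputs) {
        processOne(input.string(), outputFile);
    }
    return true;
}
```

With the functions from the code above, the loop in main could then shrink to something like processFolder(directoryPath, outputFileName, extractDataAndWriteToOutput).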

chriltola commented 1 month ago

Error (active) E0276: name followed by '::' must be a class or namespace name (Project: Index_epolicy, File: D:\Project\Epolicy\Index_epolicy\Index_epolicy\Index_epolicy.cpp, Line: 135)
Error (active) E0135: namespace "std" has no member "filesystem" (Project: Index_epolicy, File: D:\Project\Epolicy\Index_epolicy\Index_epolicy\Index_epolicy.cpp, Line: 10)

chriltola commented 1 month ago

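std::filesystem is part of C++17, so these errors usually mean the project is being compiled against an older language standard. In Visual Studio, setting the C++ Language Standard option to ISO C++17 (/std:c++17) or later normally resolves them. If you cannot change the language standard, you can list the files in a directory with a platform-specific API instead: the POSIX API on Unix-based systems or the Windows API on Windows.

Here's an example using the POSIX API for Unix-based systems (note that <dirent.h> is not shipped with Visual C++, so this version will not build as-is in a Visual Studio project):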

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iomanip>
#include <dirent.h>
#include <ctime>

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

// Function to format a string to a specific length
std::string formatString(const std::string& str, size_t length) {
    if (str.length() > length) {
        return str.substr(0, length);
    } else {
        return str + std::string(length - str.length(), ' ');
    }
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        } else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        } else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        } else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        } else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        } else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Format each string to the specified length
    pdfFileNamePrefix = formatString(pdfFileNamePrefix, 60);
    name = formatString(name, 40);
    globalEmailAddress = formatString(globalEmailAddress, 50);
    globalCompanyName = formatString(globalCompanyName, 20);
    globalInvoiceNo = formatString(globalInvoiceNo, 20);
    recordNumber = formatString(recordNumber, 20);
    title = formatString(title, 40);

    // Write the formatted data to the output file
    outputFile << pdfFileNamePrefix
               << name
               << globalEmailAddress
               << globalCompanyName
               << globalInvoiceNo
               << recordNumber
               << title << '\n';
}

void extractDataFromFilesInFolder(const std::string& folderPath, const std::string& outputFileName) {
    std::ofstream outputFile(outputFileName, std::ios::out);

    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputFileName << std::endl;
        return;
    }

    DIR* dir;
    struct dirent* ent;
    if ((dir = opendir(folderPath.c_str())) != nullptr) {
        while ((ent = readdir(dir)) != nullptr) {
            std::string fileName = ent->d_name;
            // Skip "." and ".."
            if (fileName != "." && fileName != "..") {
                std::string filePath = folderPath + "/" + fileName;
                std::ifstream inputFile(filePath);

                if (inputFile.is_open()) {
                    std::string line;
                    while (std::getline(inputFile, line)) {
                        // Capture the email address
                        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
                            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
                        }
                        // Process each <DOCUMENT> section
                        else if (line.find("<DOCUMENT>") != std::string::npos) {
                            processDocumentSection(inputFile, outputFile);
                        }
                    }
                    inputFile.close();
                } else {
                    std::cerr << "Error opening file: " << filePath << std::endl;
                }
            }
        }
        closedir(dir);
    } else {
        std::cerr << "Could not open directory: " << folderPath << std::endl;
    }

    outputFile.close();
}

std::string getCurrentDateFormatted() {
    std::time_t t = std::time(nullptr);
    std::tm tm = *std::localtime(&t);
    char buffer[7];
    std::strftime(buffer, sizeof(buffer), "%d%m%y", &tm);
    return std::string(buffer);
}

int main() {
    const std::string folderPath = "path/to/your/folder"; // Replace with your folder path
    const std::string currentDate = getCurrentDateFormatted();
    const std::string outputFileName = "CLPEMINDEX" + currentDate + ".TXT";

    extractDataFromFilesInFolder(folderPath, outputFileName);

    std::cout << "Data extraction completed. Results written to " << outputFileName << std::endl;

    return 0;
}

Explanation:

  1. Directory Traversal with POSIX API:

    • The opendir, readdir, and closedir functions are used to traverse the directory and list all files.
    • The code checks each file in the directory and skips the "." and ".." entries.
  2. Compatibility:

    • This approach works on Unix-based systems, including Linux and macOS.
  3. Integration:

    • The rest of the logic remains the same as before, processing each file and writing the results to a single output file.
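
One detail the POSIX version does not cover: readdir also returns subdirectories and other non-file entries, which the loop then tries to open as input files. A minimal sketch of filtering them with stat, assuming a POSIX system; the helper name listRegularFiles is hypothetical.

```cpp
#include <dirent.h>
#include <sys/stat.h>

#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the full paths of the regular files in
// folderPath, sorted. An unreadable folder yields an empty list.
std::vector<std::string> listRegularFiles(const std::string& folderPath) {
    std::vector<std::string> files;
    DIR* dir = opendir(folderPath.c_str());
    if (dir == nullptr) return files;

    while (dirent* ent = readdir(dir)) {
        const std::string name = ent->d_name;
        if (name == "." || name == "..") continue;
        const std::string fullPath = folderPath + "/" + name;
        struct stat info;
        // stat() follows symlinks; S_ISREG is true only for regular files.
        if (stat(fullPath.c_str(), &info) == 0 && S_ISREG(info.st_mode)) {
            files.push_back(fullPath);
        }
    }
    closedir(dir);
    // readdir returns entries in no particular order, so sort for stable output.
    std::sort(files.begin(), files.end());
    return files;
}
```

The loop in extractDataFromFilesInFolder could then iterate over the returned paths instead of calling readdir directly.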

Using the Code:

chriltola commented 1 month ago

From this code, I want the user to key in the file name, and the output file should have the same name as the keyed-in file with the extension .TXT added.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iomanip>

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

// Function to format a string to a specific length
std::string formatString(const std::string& str, size_t length) {
    if (str.length() > length) {
        return str.substr(0, length);
    }
    else {
        return str + std::string(length - str.length(), ' ');
    }
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        }
        else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        }
        else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        }
        else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        }
        else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        }
        else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Format each string to the specified length
    pdfFileNamePrefix = formatString(pdfFileNamePrefix, 60);
    name = formatString(name, 40);
    globalEmailAddress = formatString(globalEmailAddress, 50);
    globalCompanyName = formatString(globalCompanyName, 20);
    globalInvoiceNo = formatString(globalInvoiceNo, 20);
    recordNumber = formatString(recordNumber, 20);
    title = formatString(title, 40);

    // Write the formatted data to the CSV file
    outputFile << pdfFileNamePrefix
        << name
        << globalEmailAddress
        << globalCompanyName
        << globalInvoiceNo
        << recordNumber
        << title << '\n';
}

void extractDataAndWriteToCSV(const std::string& filePath, const std::string& outputCSVPath) {
    std::ifstream inputFile(filePath);
    std::ofstream outputFile(outputCSVPath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputCSVPath << std::endl;
        return;
    }

    //outputFile << "PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE\n"; //Header

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
    outputFile.close();
}

int main() {
    const std::string filePath = "ENDT.1FF.NOTE.DMS.Y23FF20000991-001.20230810.105230"; // Replace with your file path
    const std::string outputCSVPath = "output.csv"; // Replace with desired output CSV file path

    extractDataAndWriteToCSV(filePath, outputCSVPath);

    std::cout << "Data extraction completed. Results written to " << outputCSVPath << std::endl;

    return 0;
}
chriltola commented 1 month ago

To modify the code so that the user can key in the file name and the output is that file name with the .TXT extension, you can use std::cin to get the file name from the user and then append .TXT to it. Below is the modified version of your code:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iomanip>

// Global variables to store the email address, company name, and invoice number
std::string globalEmailAddress;
std::string globalCompanyName;
std::string globalInvoiceNo;

// Function to extract a substring between two delimiters
std::string extractBetween(const std::string& line, const std::string& startDelimiter, const std::string& endDelimiter) {
    std::size_t startPos = line.find(startDelimiter);
    if (startPos == std::string::npos) return "";
    startPos += startDelimiter.length();
    std::size_t endPos = line.find(endDelimiter, startPos);
    if (endPos == std::string::npos) return "";
    return line.substr(startPos, endPos - startPos);
}

// Function to extract data after a specific delimiter
std::string extractAfter(const std::string& line, const std::string& delimiter) {
    std::size_t pos = line.find(delimiter);
    if (pos == std::string::npos) return "";
    return line.substr(pos + delimiter.length());
}

// Function to format a string to a specific length
std::string formatString(const std::string& str, size_t length) {
    if (str.length() > length) {
        return str.substr(0, length);
    }
    else {
        return str + std::string(length - str.length(), ' ');
    }
}

void processDocumentSection(std::ifstream& inputFile, std::ofstream& outputFile) {
    std::string line;
    std::string pdfFileNamePrefix, name, recordNumber, title;

    while (std::getline(inputFile, line)) {
        if (line.find("</DOCUMENT>") != std::string::npos) {
            break;
        }
        if (line.find("<PDF.FILENAME.PREFIX>") != std::string::npos) {
            pdfFileNamePrefix = extractBetween(line, "<PDF.FILENAME.PREFIX>", "</PDF.FILENAME.PREFIX>");
        }
        else if (line.find("<NAME>") != std::string::npos) {
            name = extractBetween(line, "<NAME>", "</NAME>");
        }
        else if (line.find("Company/Customer's Name :") != std::string::npos) {
            globalCompanyName = extractAfter(line, "Company/Customer's Name :");
        }
        else if (line.find("Invoice No. :") != std::string::npos) {
            globalInvoiceNo = extractAfter(line, "Invoice No. :");
        }
        else if (line.find("<RECORDNUMBER>") != std::string::npos) {
            recordNumber = extractBetween(line, "<RECORDNUMBER>", "</RECORDNUMBER>");
        }
        else if (line.find("<TITLE>") != std::string::npos) {
            title = extractBetween(line, "<TITLE>", "</TITLE>");
        }
    }

    // Format each string to the specified length
    pdfFileNamePrefix = formatString(pdfFileNamePrefix, 60);
    name = formatString(name, 40);
    globalEmailAddress = formatString(globalEmailAddress, 50);
    globalCompanyName = formatString(globalCompanyName, 20);
    globalInvoiceNo = formatString(globalInvoiceNo, 20);
    recordNumber = formatString(recordNumber, 20);
    title = formatString(title, 40);

    // Write the fixed-width record to the output file
    outputFile << pdfFileNamePrefix
        << name
        << globalEmailAddress
        << globalCompanyName
        << globalInvoiceNo
        << recordNumber
        << title << '\n';
}

void extractDataAndWriteToCSV(const std::string& filePath, const std::string& outputCSVPath) {
    std::ifstream inputFile(filePath);
    std::ofstream outputFile(outputCSVPath);

    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << filePath << std::endl;
        return;
    }
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputCSVPath << std::endl;
        return;
    }

    //outputFile << "PDF.FILENAME.PREFIX,NAME,EMAIL.ADDRESS,Company/CustomerName,Invoice,RECORDNUMBER,TITLE\n"; //Header

    std::string line;
    while (std::getline(inputFile, line)) {
        // Step 1: Capture the email address
        if (line.find("<EMAIL.ADDRESS>") != std::string::npos) {
            globalEmailAddress = extractBetween(line, "<EMAIL.ADDRESS>", "</EMAIL.ADDRESS>");
        }
        // Step 2: Process each <DOCUMENT> section
        else if (line.find("<DOCUMENT>") != std::string::npos) {
            processDocumentSection(inputFile, outputFile);
        }
    }

    inputFile.close();
    outputFile.close();
}

int main() {
    std::string filePath;
    std::string outputCSVPath;

    // Prompt the user to enter the file name
    std::cout << "Enter the file name: ";
    std::getline(std::cin, filePath);

    // Append .TXT extension to the file name
    outputCSVPath = filePath + ".TXT";

    // Call the function to extract data and write to the output file
    extractDataAndWriteToCSV(filePath, outputCSVPath);

    std::cout << "Data extraction completed. Results written to " << outputCSVPath << std::endl;

    return 0;
}

Key Changes:

  1. User Input for File Name: The program now asks the user to enter the file name using std::getline(std::cin, filePath);.
  2. Appending .TXT: The .TXT extension is appended to the file name entered by the user (outputCSVPath = filePath + ".TXT";).
  3. File Handling: The input and output paths are handled dynamically based on the user input.

When you run this program, it will prompt you to enter the name of the file. The output file will be created with the name provided by the user, suffixed with .TXT.
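
One side effect worth knowing about: extractDataAndWriteToCSV constructs the output stream before it checks the input, so a mistyped file name still leaves an empty .TXT file behind, and main prints the completion message regardless. A minimal sketch of opening the input first and reporting success to the caller; the helper name openStreams is hypothetical.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: opens the input first and only creates the output
// file once the input is known to be readable. Returns false on failure.
bool openStreams(const std::string& inputPath, const std::string& outputPath,
                 std::ifstream& inputFile, std::ofstream& outputFile) {
    inputFile.open(inputPath);
    if (!inputFile.is_open()) {
        std::cerr << "Error opening file: " << inputPath << '\n';
        return false;  // the output file has not been created yet
    }
    outputFile.open(outputPath);
    if (!outputFile.is_open()) {
        std::cerr << "Error opening output file: " << outputPath << '\n';
        return false;
    }
    return true;
}
```

If the extraction function returned this result, main could print the completion message only on success and return a non-zero exit code otherwise.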

chriltola commented 1 month ago

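You can create a batch script that runs the .exe compiled from your C++ code on every file in a folder. The script iterates over the files and runs the .exe once per file, passing the file path as a command-line argument. Note that the program above reads the file name from std::cin rather than from its arguments, so it has to be changed to read argv[1] before the script can run unattended. Here's how you can do it: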

@echo off
REM Batch script to execute every file in a folder using the specified exe

REM Set the path to the folder containing the files
set "folderPath=C:\path\to\your\folder"
REM Set the path to the exe file
set "exePath=C:\path\to\your\executable.exe"

REM Iterate over each file in the folder
for %%f in ("%folderPath%\*.*") do (
    REM Execute the exe with the current file as input
    "%exePath%" "%%f"
)

echo All files have been processed.
pause

Explanation:

  1. Setting Paths:

    • folderPath: This variable holds the path to the folder containing the files you want to process.
    • exePath: This variable holds the path to the .exe file generated from your C++ code.
  2. Looping Through Files:

    • for %%f in ("%folderPath%\*.*"): This loop iterates over all files (*.*) in the specified folder. Each file is represented by %%f.
  3. Executing the EXE:

    • The line "%exePath%" "%%f" runs the .exe file with the current file (%%f) as an argument. The quotes around %%f handle any spaces in the file paths.
  4. Final Message:

    • echo All files have been processed. provides a message after all files are processed.
    • pause waits for user input before closing the command prompt, so you can see the final message.

Usage:

  1. Copy the script into a text file.
  2. Save the file with a .bat extension (e.g., process_files.bat).
  3. Edit the folderPath and exePath variables to match your setup.
  4. Double-click the .bat file to run it.

This script will execute your .exe on every file in the specified folder.
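
For the batch loop to work, the program has to take the path from its command line. A minimal sketch of one way to do that while keeping the interactive prompt as a fallback; the helper name chooseInputPath is hypothetical.

```cpp
#include <iostream>
#include <string>

// Hypothetical helper: use argv[1] as the input path when the caller
// supplied one (as the batch loop does), otherwise prompt on std::cin.
std::string chooseInputPath(int argc, const char* const argv[]) {
    if (argc > 1) {
        return argv[1];
    }
    std::cout << "Enter the file name: ";
    std::string path;
    std::getline(std::cin, path);
    return path;
}
```

With main declared as int main(int argc, char* argv[]), the prompt lines would become std::string filePath = chooseInputPath(argc, argv); and the rest of main stays as it is.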

chriltola commented 1 month ago

You can create a batch script that loops through all .TXT files in a folder and combines them into a single file named CLPEMINDEXDDMMYY.TXT, where DDMMYY is the current date. Below is the script to accomplish this:

@echo off
REM Batch script to combine all .TXT files into one file with current date in the name

REM Set the path to the folder containing the .TXT files
set "folderPath=C:\path\to\your\folder"

REM Get the current date in DDMMYY format
for /f "tokens=2-4 delims=/ " %%a in ('date /t') do set currentDate=%%a%%b%%c

REM Set the output file name with the current date
set "outputFile=CLPEMINDEX%currentDate%.TXT"

REM Combine all .TXT files in the folder into the output file
copy /b "%folderPath%\*.TXT" "%folderPath%\%outputFile%"

echo All .TXT files have been combined into %outputFile%.
pause

Explanation:

  1. Setting the Folder Path:

    • set "folderPath=C:\path\to\your\folder": This sets the variable folderPath to the path of the folder containing the .TXT files.
  2. Getting the Current Date:

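    • The for /f loop splits the output of date /t on "/" and spaces and joins tokens 2 to 4. The result depends on the system's regional date format: with output such as "Fri 08/09/2024" it yields 08092024, so the order of the parts and the four-digit year may not match DDMMYY.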
  3. Setting the Output File Name:

    • set "outputFile=CLPEMINDEX%currentDate%.TXT": This creates the output file name using the current date.
  4. Combining the Files:

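    • copy /b "%folderPath%\*.TXT" "%folderPath%\%outputFile%": This concatenates all .TXT files in the folder into the single output file. Because the output file is written to the same folder with a .TXT extension, combined files from earlier runs also match *.TXT and will be included unless they are moved or deleted first.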
  5. Final Message:

    • The script echoes a confirmation message and pauses to keep the command window open until you press a key.

Usage:

  1. Copy the script into a text file.
  2. Save the file with a .bat extension (e.g., combine_txt_files.bat).
  3. Edit the folderPath variable to match your folder's path.
  4. Double-click the .bat file to run it.

This script will create a combined .TXT file named `CL

chriltola commented 1 month ago

To format the date as DDMMYY (e.g., 090824 for August 9th, 2024), you'll need to adjust the batch script to extract the date components separately and keep only the last two digits of the year. Here’s the revised script:

@echo off
REM Batch script to combine all .TXT files into one file with current date in the name

REM Set the path to the folder containing the .TXT files
set "folderPath=C:\path\to\your\folder"

REM Get the current date in DDMMYY format
for /f "tokens=1-3 delims=/" %%a in ('echo %date%') do (
    set day=%%a
    set month=%%b
    set year=%%c
)

REM Format the year as the last two digits
set year=%year:~-2%

REM Set the output file name with the current date
set "outputFile=CLPEMINDEX%day%%month%%year%.TXT"

REM Combine all .TXT files in the folder into the output file
copy /b "%folderPath%\*.TXT" "%folderPath%\%outputFile%"

echo All .TXT files have been combined into %outputFile%.
pause

Explanation of the Key Changes:

  1. Date Extraction:

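    • The for /f loop splits %date% on "/" into day, month, and year. This assumes the regional short-date format is DD/MM/YYYY with no day-name prefix; other formats (for example "Fri 08/09/2024" or YYYY-MM-DD) need different tokens or delimiters.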
    • set day=%%a, set month=%%b, and set year=%%c store the day, month, and year parts.
  2. Year Formatting:

    • set year=%year:~-2% ensures that only the last two digits of the year are used.
  3. Output File Name:

    • The output file is named using the day, month, and year variables, resulting in the correct DDMMYY format.

Usage:

  1. Ensure the folderPath is set correctly.
  2. Save the script with a .bat extension (e.g., combine_txt_files.bat).
  3. Run the script.

This will create the combined .TXT file named CLPEMINDEX090824.TXT when run on August 9th, 2024 on a system whose date format is DD/MM/YYYY.
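
Since %date% depends on regional settings, another option is to build the date stamp and do the concatenation in C++, where strftime's %d, %m and %y always give two-digit day, month and year. A minimal sketch under that assumption; dateStamp and concatenateFiles are hypothetical names, and the caller is assumed to supply the list of input paths.

```cpp
#include <cstddef>
#include <ctime>
#include <fstream>
#include <string>
#include <vector>

// Builds a DDMMYY stamp from a broken-down time, independent of the
// system's regional date format.
std::string dateStamp(const std::tm& when) {
    char buffer[7];  // 6 digits plus the terminating NUL
    std::strftime(buffer, sizeof(buffer), "%d%m%y", &when);
    return buffer;
}

// Appends the bytes of each input file to outputPath, in the order given.
// Returns the number of input files that could be opened.
std::size_t concatenateFiles(const std::vector<std::string>& inputs,
                             const std::string& outputPath) {
    std::ofstream out(outputPath, std::ios::binary);
    if (!out.is_open()) return 0;

    std::size_t copied = 0;
    for (const std::string& path : inputs) {
        std::ifstream in(path, std::ios::binary);
        if (!in.is_open()) continue;  // skip unreadable inputs
        // Streaming an empty rdbuf() would set failbit on `out`,
        // so only copy when the file has at least one byte.
        if (in.peek() != std::ifstream::traits_type::eof()) {
            out << in.rdbuf();
        }
        ++copied;
    }
    return copied;
}
```

For today's date, the stamp can be obtained with std::time_t t = std::time(nullptr); followed by dateStamp(*std::localtime(&t)), and the output name is then "CLPEMINDEX" + stamp + ".TXT".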