tianshilu commented 3 years ago

Hi @schristley @acrinklaw ,

I cloned the repository and tried to compile it. Running "cmake ." succeeds, but "cmake --build ." fails with the error message shown below.

My cmake version is 3.20.4. Could you give me a hint on how to fix the problem?

Thanks!

acrinklaw commented 3 years ago

Hi @tianshilu, what version of gcc do you have? This is most likely an issue of having a different version of gcc (which implements openmp) than is required.

tianshilu commented 3 years ago

Hi @acrinklaw, thanks for your prompt response. I attached my gcc version.

acrinklaw commented 3 years ago

No problem at all. I would suggest updating your gcc to the highest version you can get on Ubuntu 18.04; when I wrote this I was using gcc 9.X. If it still gives you issues I will troubleshoot. I will also add some logic to CMake to make sure that everything is compatible for future users. Thanks!

tianshilu commented 3 years ago

Thank you for your suggestion. I will upgrade gcc. Will keep you posted. Appreciate your help!

tianshilu commented 3 years ago

I have one additional question. I tried to download the package https://github.com/IEDB/TCRMatch/releases/download/v0.1.0/TCRMatch_v0.1.0_Linux.tar.gz, which is already compiled. I could not execute it, and the error message is 'segmentation fault (core dump)'. Do you know what the problem is?


acrinklaw commented 3 years ago

Yeah I believe that is because I compiled that binary in a different environment than you have. It was a mistake on my part - I'm still learning C++ development practices :-) It is getting late here but tomorrow I will see how people go about properly sharing binaries. For now I still suggest trying to upgrade gcc and compiling TCRMatch

tianshilu commented 3 years ago

I upgraded gcc to 9.4.0 and compiled TCRMatch. The same error, "invalid controlling predicate", still occurs. Do you have other suggestions to fix the problem?

acrinklaw commented 3 years ago

Hmm that is quite strange. The error is suggesting something is wrong with the parallelized for loops that are using openMP. I am unable to recreate it locally. Would you mind attaching the code from tcrmatch.cpp? In addition can you attach the output from echo |cpp -fopenmp -dM |grep -i open? Thank you

tianshilu commented 3 years ago

The output is:

#define _OPENMP 201511

It doesn't allow me to attach the .cpp file here, so I pasted the code below. Let me know if that is not convenient for you; I can send it by email instead. Thanks for your help!

include

std::array<std::array<float, 20>, 20> k1; int p_kmin = 1; int p_kmax = 30; float p_beta = 0.11387; // Hardcoded because parsing + computing matrix is annoying float blm_qij[20][20] = { {0.0215, 0.0023, 0.0019, 0.0022, 0.0016, 0.0019, 0.003, 0.0058, 0.0011, 0.0032, 0.0044, 0.0033, 0.0013, 0.0016, 0.0022, 0.0063, 0.0037, 0.0004, 0.0013, 0.0051}, {0.0023, 0.0178, 0.002, 0.0016, 0.0004, 0.0025, 0.0027, 0.0017, 0.0012, 0.0012, 0.0024, 0.0062, 0.0008, 0.0009, 0.001, 0.0023, 0.0018, 0.0003, 0.0009, 0.0016}, {0.0019, 0.002, 0.0141, 0.0037, 0.0004, 0.0015, 0.0022, 0.0029, 0.0014, 0.001, 0.0014, 0.0024, 0.0005, 0.0008, 0.0009, 0.0031, 0.0022, 0.0002, 0.0007, 0.0012}, {0.0022, 0.0016, 0.0037, 0.0213, 0.0004, 0.0016, 0.0049, 0.0025, 0.001, 0.0012, 0.0015, 0.0024, 0.0005, 0.0008, 0.0012, 0.0028, 0.0019, 0.0002, 0.0006, 0.0013}, {0.0016, 0.0004, 0.0004, 0.0004, 0.0119, 0.0003, 0.0004, 0.0008, 0.0002, 0.0011, 0.0016, 0.0005, 0.0004, 0.0005, 0.0004, 0.001, 0.0009, 0.0001, 0.0003, 0.0014}, {0.0019, 0.0025, 0.0015, 0.0016, 0.0003, 0.0073, 0.0035, 0.0014, 0.001, 0.0009, 0.0016, 0.0031, 0.0007, 0.0005, 0.0008, 0.0019, 0.0014, 0.0002, 0.0007, 0.0012}, {0.003, 0.0027, 0.0022, 0.0049, 0.0004, 0.0035, 0.0161, 0.0019, 0.0014, 0.0012, 0.002, 0.0041, 0.0007, 0.0009, 0.0014, 0.003, 0.002, 0.0003, 0.0009, 0.0017}, {0.0058, 0.0017, 0.0029, 0.0025, 0.0008, 0.0014, 0.0019, 0.0378, 0.001, 0.0014, 0.0021, 0.0025, 0.0007, 0.0012, 0.0014, 0.0038, 0.0022, 0.0004, 0.0008, 0.0018}, {0.0011, 0.0012, 0.0014, 0.001, 0.0002, 0.001, 0.0014, 0.001, 0.0093, 0.0006, 0.001, 0.0012, 0.0004, 0.0008, 0.0005, 0.0011, 0.0007, 0.0002, 0.0015, 0.0006}, {0.0032, 0.0012, 0.001, 0.0012, 0.0011, 0.0009, 0.0012, 0.0014, 0.0006, 0.0184, 0.0114, 0.0016, 0.0025, 0.003, 0.001, 0.0017, 0.0027, 0.0004, 0.0014, 0.012}, {0.0044, 0.0024, 0.0014, 0.0015, 0.0016, 0.0016, 0.002, 0.0021, 0.001, 0.0114, 0.0371, 0.0025, 0.0049, 0.0054, 0.0014, 0.0024, 0.0033, 0.0007, 0.0022, 0.0095}, {0.0033, 0.0062, 0.0024, 0.0024, 0.0005, 0.0031, 
0.0041, 0.0025, 0.0012, 0.0016, 0.0025, 0.0161, 0.0009, 0.0009, 0.0016, 0.0031, 0.0023, 0.0003, 0.001, 0.0019}, {0.0013, 0.0008, 0.0005, 0.0005, 0.0004, 0.0007, 0.0007, 0.0007, 0.0004, 0.0025, 0.0049, 0.0009, 0.004, 0.0012, 0.0004, 0.0009, 0.001, 0.0002, 0.0006, 0.0023}, {0.0016, 0.0009, 0.0008, 0.0008, 0.0005, 0.0005, 0.0009, 0.0012, 0.0008, 0.003, 0.0054, 0.0009, 0.0012, 0.0183, 0.0005, 0.0012, 0.0012, 0.0008, 0.0042, 0.0026}, {0.0022, 0.001, 0.0009, 0.0012, 0.0004, 0.0008, 0.0014, 0.0014, 0.0005, 0.001, 0.0014, 0.0016, 0.0004, 0.0005, 0.0191, 0.0017, 0.0014, 0.0001, 0.0005, 0.0012}, {0.0063, 0.0023, 0.0031, 0.0028, 0.001, 0.0019, 0.003, 0.0038, 0.0011, 0.0017, 0.0024, 0.0031, 0.0009, 0.0012, 0.0017, 0.0126, 0.0047, 0.0003, 0.001, 0.0024}, {0.0037, 0.0018, 0.0022, 0.0019, 0.0009, 0.0014, 0.002, 0.0022, 0.0007, 0.0027, 0.0033, 0.0023, 0.001, 0.0012, 0.0014, 0.0047, 0.0125, 0.0003, 0.0009, 0.0036}, {0.0004, 0.0003, 0.0002, 0.0002, 0.0001, 0.0002, 0.0003, 0.0004, 0.0002, 0.0004, 0.0007, 0.0003, 0.0002, 0.0008, 0.0001, 0.0003, 0.0003, 0.0065, 0.0009, 0.0004}, {0.0013, 0.0009, 0.0007, 0.0006, 0.0003, 0.0007, 0.0009, 0.0008, 0.0015, 0.0014, 0.0022, 0.001, 0.0006, 0.0042, 0.0005, 0.001, 0.0009, 0.0009, 0.0102, 0.0015}, {0.0051, 0.0016, 0.0012, 0.0013, 0.0014, 0.0012, 0.0017, 0.0018, 0.0006, 0.012, 0.0095, 0.0019, 0.0023, 0.0026, 0.0012, 0.0024, 0.0036, 0.0004, 0.0015, 0.0196}};

struct peptide {
  std::string seq;
  int len;
  float aff;
  std::vector<int> i;
};

std::vector<std::string> read_IEDB_data() {
  std::vector<std::string> iedb_data;
  std::ifstream iedb_file("data/IEDB_data.tsv");
  std::string line;
  while (getline(iedb_file, line)) {
    std::stringstream ss(line);
    std::string sequence;
    ss >> sequence;
    if (sequence != "trimmed_seq") {
      iedb_data.push_back(sequence);
    }
  }
  return iedb_data;
}

std::array<std::array<float, 20>, 20> fmatrix_k1() {
  // Calculates the modified (normalized) blosum62 matrix
  int k, j;
  float marg[20];
  float sum;

  // initialize margin array
  for (int i = 0; i < 20; i++) {
    marg[i] = 0.0;
  }
  // initialize k1
  for (int i = 0; i < 20; i++) {
    for (int j = 0; j < 20; j++) {
      k1[i][j] = 0.0;
    }
  }
  // normalize matrix by marginal frequencies
  for (j = 0; j < 20; j++) {
    sum = 0;
    for (k = 0; k < 20; k++)
      sum += blm_qij[j][k];
    marg[j] = sum;
  }
  // calculate K1
  for (j = 0; j < 20; j++) {
    for (k = 0; k < 20; k++) {
      k1[j][k] = blm_qij[j][k] / (marg[j] * marg[k]);
      k1[j][k] = pow(k1[j][k], p_beta);
    }
  }

  return (k1);
}

float k3_sum(peptide pep1, peptide pep2) {
  // Recursively calculate Kernel 3 using Kernel 1 lookups
  float k2, term, k3 = 0.0;
  int start1, start2;
  int k, j1, j2;

  float k2_prod_save[31][31][31];

  for (k = p_kmin; k <= p_kmax; k++) {
    for (start1 = 0; start1 <= pep1.len - k; start1++) {
      for (start2 = 0; start2 <= pep2.len - k; start2++) {

        j1 = pep1.i[start1 + k - 1];
        j2 = pep2.i[start2 + k - 1];
        term = k1[j1][j2];

        if (k == 1) {
          k2 = term;
        } else {
          k2 = k2_prod_save[k - 1][start1][start2] * term;
        }

        k2_prod_save[k][start1][start2] = k2;
        k3 += k2;
      }
    }
  }
  return (k3);
}

void multi_calc_k3(std::vector<peptide> peplist1, std::vector<peptide> peplist2,
                   float threshold) {
  // Simple method to calculate pairwise TCRMatch scores using two peptide
  // vectors
  std::vector<std::tuple<std::string, std::string, float>>
      results[omp_get_max_threads()];

#pragma omp parallel for
  for (int i = 0; i < peplist1.size(); i++) {
    for (int j = 0; j < peplist2.size(); j++) {
      peptide pep1 = peplist1[i];
      peptide pep2 = peplist2[j];
      float score = 0.0;
      score = k3_sum(pep1, pep2) / sqrt(pep1.aff * pep2.aff);
      if (score > threshold) {
        int tid = omp_get_thread_num();
        results[tid].push_back(make_tuple(pep1.seq, pep2.seq, score));
      }
    }
  }
  for (int i = 0; i < omp_get_max_threads(); i++) {
    for (auto &tuple : results[i]) {
      std::cout << std::fixed << std::setprecision(2) << std::get<0>(tuple)
                << " " << std::get<1>(tuple) << " " << std::get<2>(tuple)
                << std::endl;
    }
  }
}

// Move this to outside -> import everything you need
int main(int argc, char *argv[]) {
  int opt;
  int n_threads;
  float threshold;
  std::string in_file;
  int i_flag = -1;
  int t_flag = -1;
  int thresh_flag = -1;

  // Command line argument parsing
  while ((opt = getopt(argc, argv, "t:i:s:")) != -1) {
    switch (opt) {
    case 't':
      n_threads = atoi(optarg);
      t_flag = 1;
      break;
    case 'i':
      in_file = optarg;
      i_flag = 1;
      break;
    case 's':
      threshold = std::stof(optarg);
      thresh_flag = 1;
      break;
    default:
      std::cerr << "Usage: ./tcrmatch -i infile_name.txt -t num_threads -s score_threshold"
                << std::endl;
      return EXIT_FAILURE;
    }
  }
  // Check that required parameters are there
  if (i_flag == -1 || t_flag == -1) {
    std::cerr << "Missing mandatory parameters" << std::endl
              << "Usage: ./tcrmatch -i infile_name.txt -t num_threads"
              << std::endl;
    return EXIT_FAILURE;
  }
  if (thresh_flag == -1) {
    threshold = .97;
  }

  std::vector<std::string> iedb_data = read_IEDB_data();
  std::ifstream file1(in_file);
  std::string line;
  std::string alphabet;
  std::vector<peptide> peplist1;
  std::vector<peptide> peplist2;

  omp_set_num_threads(n_threads);

  alphabet = "ARNDCQEGHILKMFPSTWYV";
  k1 = fmatrix_k1();
  while (getline(file1, line)) {
    std::vector<int> int_vec;
    for (int i = 0; i < line.length(); i++) {
      if (alphabet.find(line[i]) == -1) {
        std::cerr << "Invalid amino acid found in " << line << " at position "
                  << i + 1 << std::endl;
        return EXIT_FAILURE;
      }
    }
    peplist1.push_back({line, int(line.length()), -99.9, int_vec});
  }
  file1.close();

  // Calculate the normalization score (aff) (kernel 3 self vs self) list 1
#pragma omp parallel for
  for (std::vector<peptide>::iterator it = peplist1.begin();
       it != peplist1.end(); it++) {
    for (int x = 0; x < it->len; x++) {
      it->i.push_back(alphabet.find(it->seq[x]));
    }
    it->aff = k3_sum(*it, *it);
  }

  // change to IEDB data
  for (std::vector<std::string>::iterator it = iedb_data.begin();
       it != iedb_data.end(); it++) {
    std::vector<int> int_vec;
    for (int i = 0; i < (*it).length(); i++) {
      if (alphabet.find((*it)[i]) == -1) {
        std::cerr << "Invalid amino acid found in " << *it << " at position "
                  << i + 1 << std::endl;
        return EXIT_FAILURE;
      }
    }
    peplist2.push_back({*it, int((*it).length()), -99.9, int_vec});
  }

  // Calculate the normalization score (aff) (kernel 3 self vs self) for list 2
#pragma omp parallel for
  for (std::vector<peptide>::iterator it = peplist2.begin();
       it != peplist2.end(); it++) {
    for (int x = 0; x < it->len; x++) {
      it->i.push_back(alphabet.find(it->seq[x]));
    }
    it->aff = k3_sum(*it, *it);
  }
  multi_calc_k3(peplist1, peplist2, threshold);

  return 0;
}

acrinklaw commented 3 years ago

So it looks like the code is the same as what I am compiling. And your openMP specification is 4.5 which is also what I used. I think a last step before I do a deep dive into why this might be occurring is to try building without cmake.

Can you try

$ g++ -fopenmp -O3 src/tcrmatch.cpp -o tcrmatch

and see if that builds? Then if you run ./tcrmatch you should get

Missing mandatory parameters
Usage: ./tcrmatch -i infile_name.txt -t num_threads
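For reference, the _OPENMP value printed earlier is a yyyymm release date, and 201511 is the date of the 4.5 specification. The lookup below is only a small sketch of that mapping; the function names are made up for illustration and are not part of TCRMatch, and releases newer than 5.0 are deliberately left out.

```cpp
#include <string>

// Map the _OPENMP date macro (yyyymm) to the OpenMP specification it names.
// Only releases up to 5.0 are listed; anything else is reported as "unknown".
std::string openmp_spec_name(long date_macro) {
  switch (date_macro) {
    case 200505: return "2.5";
    case 200805: return "3.0";
    case 201107: return "3.1";
    case 201307: return "4.0";
    case 201511: return "4.5";
    case 201811: return "5.0";
    default:     return "unknown";
  }
}

// Specification this translation unit was compiled against,
// or "disabled" when it was built without -fopenmp.
std::string compiled_openmp_spec() {
#ifdef _OPENMP
  return openmp_spec_name(_OPENMP);
#else
  return "disabled";
#endif
}
```

Note that the macro only names the specification the compiler claims as its baseline. It does not say which individual features from a newer specification the compiler may already accept.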

tianshilu commented 3 years ago

Thank you for your response! I got the same error.

schristley commented 3 years ago

Not sure if it is relevant, but to compile on our HPC system here I had to set CC and CXX to get cmake to use the right compiler; otherwise it kept using the super-old one in /bin.

$ module load gcc/9.1.0
$ export CC=/opt/apps/gcc/9.1.0/bin/gcc
$ export CXX=/opt/apps/gcc/9.1.0/bin/g++
$ cmake .
$ cmake --build .
Scanning dependencies of target tcrmatch
[ 50%] Building CXX object CMakeFiles/tcrmatch.dir/src/tcrmatch.cpp.o
[100%] Linking CXX executable tcrmatch
[100%] Built target tcrmatch

$ ./tcrmatch 
Missing mandatory parameters
Usage: ./tcrmatch -i infile_name.txt -t num_threads

acrinklaw commented 3 years ago

Thanks @schristley. That may be the issue. @tianshilu, can you verify which version of g++ is being used by running g++ --version? If it is using the proper version (9.X) and you still get the error, I can do one of two things: provide you with the Python version, which has a slightly slower run time, or provide you with a Dockerfile that I know will run.
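Since g++ --version only reports whichever g++ comes first on the PATH, another way to check is to let the compiler identify itself from inside a build. The sketch below relies only on macros that g++ predefines (__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__, and _OPENMP when -fopenmp is on); compiler_summary is a hypothetical helper name, not something in the TCRMatch source.

```cpp
#include <sstream>
#include <string>

// Summarise the compiler that actually built this file.
// The __GNUC__ family is predefined by g++ (and by compilers that mimic it).
std::string compiler_summary() {
  std::ostringstream out;
#if defined(__GNUC__)
  out << "GNU-compatible compiler " << __GNUC__ << "." << __GNUC_MINOR__ << "."
      << __GNUC_PATCHLEVEL__;
#else
  out << "non-GNU compiler";
#endif
#ifdef _OPENMP
  out << ", _OPENMP=" << _OPENMP;  // defined only when OpenMP is enabled
#else
  out << ", OpenMP disabled";
#endif
  return out.str();
}
```

Printing the result of compiler_summary() from a small main, built with exactly the same command or the same CC/CXX settings as the failing build, shows whether the 9.X compiler is really the one being invoked.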

schristley commented 3 years ago

If you do a google for "pragma omp parallel for invalid controlling predicate" then you will also find various discussions. It's also possible that openmp doesn't like the C++ style for iterator, maybe it isn't "basic" enough.

  for (std::vector<peptide>::iterator it = peplist1.begin();

Not sure why it would work for me though

acrinklaw commented 3 years ago

> If you do a google for "pragma omp parallel for invalid controlling predicate" then you will also find various discussions. It's also possible that openmp doesn't like the C++ style for iterator, maybe it isn't "basic" enough.
>   for (std::vector<peptide>::iterator it = peplist1.begin();
> Not sure why it would work for me though

Yeah that's why I'm confused as well, when I was first writing this it seemed that openMP 4.5 supported iterators as well as the != operator, which it didn't in the past. And I'm having trouble recreating this since it compiles and works fine using a very similar environment

schristley commented 3 years ago

Ah, I bet it's this:

it != peplist1.end();

The != is not a valid operator.

schristley commented 3 years ago

> Yeah that's why I'm confused as well, when I was first writing this it seemed that openMP 4.5 supported iterators as well as the != operator, which it didn't in the past. And I'm having trouble recreating this since it compiles and works fine using a very similar environment

Hmm, ok, then it's likely something with the environment...
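If upgrading the compiler is not an option, one workaround is to drive the parallel loop with an integer index and a < test instead of an iterator and !=. As far as I know, != in the loop test only became part of the canonical loop form in OpenMP 5.0, which g++ 9 partially implements, so an older g++ can reject the iterator loop with "invalid controlling predicate" while still reporting the same _OPENMP value. The sketch below is not the TCRMatch code: Pep and encode_all are hypothetical names, and it assumes the sequences have already been validated against the alphabet.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the peptide struct, reduced to what the loop touches.
struct Pep {
  std::string seq;
  std::vector<int> idx;
};

// Fill idx for every peptide using an index-based loop with a '<' test,
// a loop shape that OpenMP versions before 5.0 also accept.
void encode_all(std::vector<Pep>& peps, const std::string& alphabet) {
  const long n = static_cast<long>(peps.size());
#pragma omp parallel for
  for (long p = 0; p < n; ++p) {
    // Each iteration writes only to its own element, so no locking is needed.
    Pep& pep = peps[static_cast<std::size_t>(p)];
    pep.idx.clear();
    for (char c : pep.seq) {
      // Assumes c occurs in alphabet (validated beforehand).
      pep.idx.push_back(static_cast<int>(alphabet.find(c)));
    }
  }
}
```

The same loop still compiles and runs serially when built without -fopenmp, because the pragma is then ignored.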

acrinklaw commented 3 years ago

I have a feeling it is because of g++ versioning. @tianshilu please make sure g++ is up-to-date and 9.X. If this still does not work, email me at acrinklaw@lji.org and I will send you the Python version or Dockerfile, whichever one is easiest for you. Thanks

tianshilu commented 3 years ago

Thank you so much for your responses, suggestions, and help @schristley @acrinklaw. You are totally right: it was a g++ versioning problem. I am trying to update g++ on my machine.

IEDB / TCRMatch

compile error #8
