eriqande / snppit

Program for large scale parent pair inference (in salmon, etc.)

Compilation error multiple definition #5

Open ckastall opened 2 years ago

ckastall commented 2 years ago

Hi,

I've been trying to compile snppit under Linux (gcc 12.2.0) by running Compile_snppit.sh.

This failed with these errors:

src/pbt_C_fb.c: In function ‘ForwardStep’:
src/pbt_C_fb.c:250:48: warning: argument 1 range [18446744071562067968, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
  250 |                 ret->FB[t] = (struct fb_cell *)calloc(ret->Kt[t], sizeof(struct fb_cell));
      |                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/pbt_C_fb.c:4:
/usr/include/stdlib.h:556:14: note: in a call to allocation function ‘calloc’ declared here
  556 | extern void *calloc (size_t __nmemb, size_t __size)
      |              ^~~~~~
/usr/bin/ld: /tmp/ccUQCvT1.o:(.bss+0x0): multiple definition of `gHasSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x0): first defined here
/usr/bin/ld: /tmp/ccUQCvT1.o:(.bss+0x20): multiple definition of `gSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x20): first defined here
/usr/bin/ld: /tmp/ccJt2mTE.o:(.bss+0x0): multiple definition of `gHasSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x0): first defined here
/usr/bin/ld: /tmp/ccJt2mTE.o:(.bss+0x20): multiple definition of `gSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x20): first defined here
/usr/bin/ld: /tmp/ccNp1FIn.o:(.bss+0x0): multiple definition of `gHasSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x0): first defined here
/usr/bin/ld: /tmp/ccNp1FIn.o:(.bss+0x20): multiple definition of `gSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x20): first defined here
/usr/bin/ld: /tmp/ccbyE1VZ.o:(.bss+0x0): multiple definition of `gHasSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x0): first defined here
/usr/bin/ld: /tmp/ccbyE1VZ.o:(.bss+0x20): multiple definition of `gSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x20): first defined here
/usr/bin/ld: /tmp/ccC37iJn.o:(.bss+0x64): multiple definition of `gHasSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x0): first defined here
/usr/bin/ld: /tmp/ccC37iJn.o:(.bss+0x80): multiple definition of `gSnpSumPedPath'; /tmp/ccNqPloF.o:(.bss+0x20): first defined here
collect2: error: ld returned 1 exit status

My fix was to add the flag "-fcommon" to the gcc command, and it now compiles successfully. I've run the tests, and almost all of them pass.

hzz0024 commented 1 year ago

I had the same compilation issue, and it was fixed by adding "-fcommon" to the gcc command. However, I hit another problem when I ran the tests. See the error message below:

./run_all_tests.sh

STARTING TEST IN DIRECTORY input7

DATA HAVE BEEN READ.  SUMMARIES APPEAR IN:  snppit_output_BasicDataSummary.txt

COMPUTING AN APPROPRIATE S-MAX
Compiling trio type probabilities for 9 parental collections
Error Processing Option --xml-pream!   Incorrect number of arguments (2 instead of 1) to option --xml-pream
../run_test.sh: line 7: 19617 Segmentation fault      (core dumped) $bin -f datafile.txt $(cat pars)
done running program

Comparing output files for consistency to previous results under Darwin

cmp: snppit_output_ChosenSMAXes.txt: No such file or directory
cmp: snppit_output_FDR_Summary.txt: No such file or directory
cmp: snppit_output_ParentageAssignments.txt: No such file or directory
snppit_output_PopSizesAnPiVectors.txt Consistent
cmp: snppit_output_TrioPosteriors.txt: No such file or directory

Comparing output files for consistency to previous results under Linux

cmp: snppit_output_ChosenSMAXes.txt: No such file or directory
cmp: snppit_output_FDR_Summary.txt: No such file or directory
cmp: snppit_output_ParentageAssignments.txt: No such file or directory
snppit_output_PopSizesAnPiVectors.txt Consistent
cmp: snppit_output_TrioPosteriors.txt: No such file or directory
DONE WITH TEST IN DIRECTORY input7

This is quite annoying, and I could not find a solution to the --xml-pream option error. Any suggestions would be appreciated. I am using Ubuntu 23.04; an older version of Ubuntu worked perfectly. I regret updating.

pseudogene commented 7 months ago

Same issue on Ubuntu 22.04 / 24.04... It recompiles and works perfectly on macOS (M2)... so it is a nice tool to toy with data, but it is useless for actually processing large amounts of (real-world) data.

eriqande commented 7 months ago

Hi Y'all,

I have modified the code to eliminate the "multiple definition of gSnpSumPedPath" errors, so the -fcommon option to gcc should no longer be needed.
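For anyone who hits the same linker error in other code: GCC 10 changed its default from -fcommon to -fno-common, so a global variable defined in a header that several .c files include is now reported as a multiple definition. What follows is a minimal sketch of the usual remedy, not necessarily the exact change made in snppit. The header name, the variable types, the array size, and the helper `set_ped_path` are all assumptions made for illustration; only the two variable names come from the error messages above.

```c
#include <stdio.h>
#include <string.h>

/* Would live in a shared header (hypothetical name: snppit_globals.h).
   `extern` makes these declarations only: no storage is reserved,
   so any number of .c files can include the header safely. */
extern int  gHasSnpSumPedPath;
extern char gSnpSumPedPath[1024];   /* type and size are assumptions */

/* Would live in exactly ONE .c file: the single definition that
   actually reserves storage for each variable. */
int  gHasSnpSumPedPath = 0;
char gSnpSumPedPath[1024];

/* Hypothetical helper showing that every file sees the same variables. */
void set_ped_path(const char *path)
{
    /* snprintf truncates safely if the path is longer than the buffer */
    snprintf(gSnpSumPedPath, sizeof gSnpSumPedPath, "%s", path);
    gHasSnpSumPedPath = 1;
}
```

With this layout the code links under both -fcommon and -fno-common, because only one object file ever contains the definitions.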

Sorry to hear about the --xml-pream issue you are having on Ubuntu. I have not been able to reproduce that problem on the Linux machines I have access to:

SEDNA cluster at NMFS/NWFSC

OS

(base) [sedna: test]--% cat  /etc/*release
CentOS Linux release 8.2.2004 (Core)
NAME="CentOS Linux"
VERSION="8 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="8"

CentOS Linux release 8.2.2004 (Core)
CentOS Linux release 8.2.2004 (Core)

CPU

(base) [sedna: test]--% lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              20
On-line CPU(s) list: 0-19
Thread(s) per core:  1
Core(s) per socket:  10
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz
Stepping:            7
CPU MHz:             2476.652
CPU max MHz:         2201.0000
CPU min MHz:         1000.0000
BogoMIPS:            4400.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            14080K
NUMA node0 CPU(s):   0-9
NUMA node1 CPU(s):   10-19
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear flush_l1d arch_capabilities

GCC 8.3.1

(base) [sedna: test]--% gcc --version
gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Outcome

Compiles with one warning:

src/pbt_C_fb.c: In function ‘ForwardStep’:
src/pbt_C_fb.c:250:34: warning: argument 1 range [18446744071562067968, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
   ret->FB[t] = (struct fb_cell *)calloc(ret->Kt[t], sizeof(struct fb_cell));
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/pbt_C_fb.c:4:
/usr/include/stdlib.h:541:14: note: in a call to allocation function ‘calloc’ declared here
 extern void *calloc (size_t __nmemb, size_t __size)
              ^~~~~~

This is likely a compiler issue. It is complaining about a value that the compiler can't even know---it is set at run time.
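A possible reading of the numbers in that warning, offered as a sketch rather than a diagnosis: the lower bound 18446744071562067968 equals 2^64 - 2^31, which is what the most negative 32-bit `int` becomes when it is converted to a 64-bit `size_t`. So GCC appears to be reasoning about the signed type of the count (assuming `ret->Kt[t]` is an `int`, which this thread does not confirm) rather than about a known run-time value. The function names below are hypothetical and not part of snppit.

```c
#include <stdint.h>
#include <stdlib.h>

/* The value a signed count takes once it is implicitly converted
   to the size_t parameter of calloc. Negative inputs wrap around
   to very large unsigned values. */
size_t as_calloc_count(int n)
{
    return (size_t)n;
}

/* Hypothetical guarded wrapper: reject negative counts before they
   can wrap into an enormous allocation request. */
void *calloc_count(int n, size_t elem_size)
{
    if (n < 0) {
        return NULL;
    }
    return calloc((size_t)n, elem_size);
}
```

An explicit check like the one in `calloc_count` makes the intent clear and may also let the compiler rule out the negative range, although whether the warning disappears depends on the GCC version and optimization level.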

The tests were Consistent with the standards, apart from some sorting differences that are likely due to the updated compiler. I reset the standards for the newer compiler and now get consistency between tests:

Comparing output files for consistency to previous results under Linux

snppit_output_ChosenSMAXes.txt Consistent
snppit_output_FDR_Summary.txt Consistent
snppit_output_ParentageAssignments.txt Consistent
snppit_output_PopSizesAnPiVectors.txt Consistent
snppit_output_TrioPosteriors.txt Consistent
DONE WITH TEST IN DIRECTORY input7

ALPINE supercomputer at CU Boulder

OS

(base) [c3cpu-c11-u32-1: snppit]--% cat /etc/*release
production stateful
NAME="Red Hat Enterprise Linux"
VERSION="8.4 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.4"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.4 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.4:GA"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.4"
Red Hat Enterprise Linux release 8.4 (Ootpa)
Red Hat Enterprise Linux release 8.4 (Ootpa)

CPU

(base) [c3cpu-c11-u32-1: snppit]--% lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        2
Vendor ID:           AuthenticAMD
CPU family:          25
Model:               1
Model name:          AMD EPYC 7313 16-Core Processor
Stepping:            1
CPU MHz:             2922.655
BogoMIPS:            5988.45
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            32768K
NUMA node0 CPU(s):   0-15
NUMA node1 CPU(s):   16-31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate sme ssbd mba sev ibrs ibpb stibp vmmcall sev_es fsgsbase bmi1 avx2 smep bmi2 invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd amd_ppin arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca

Pre-Compiled Version

Using the binary compiled on SEDNA:

Outcome

Consistent:

Comparing output files for consistency to previous results under Linux

snppit_output_ChosenSMAXes.txt Consistent
snppit_output_FDR_Summary.txt Consistent
snppit_output_ParentageAssignments.txt Consistent
snppit_output_PopSizesAnPiVectors.txt Consistent
snppit_output_TrioPosteriors.txt Consistent
DONE WITH TEST IN DIRECTORY input7

GCC 8.5.0

gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-16)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Same warning on compilation as before.

All tests consistent.

GCC 11.2.0

(base) [c3cpu-c11-u32-1: snppit]--% gcc --version
gcc (GCC) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Three warnings similar to the one with gcc 8:

Compiling up executable snppit-Linux
src/pbt_C_fb.c: In function ‘ForwardStep’:
src/pbt_C_fb.c:250:48: warning: argument 1 range [18446744071562067968, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
  250 |                 ret->FB[t] = (struct fb_cell *)calloc(ret->Kt[t], sizeof(struct fb_cell));
      |                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/pbt_C_fb.c:4:
/usr/include/stdlib.h:541:14: note: in a call to allocation function ‘calloc’ declared here
  541 | extern void *calloc (size_t __nmemb, size_t __size)
      |              ^~~~~~
src/pbt_C_fb.c: In function ‘Get_pbt_C_fb_Opts’:
src/pbt_C_fb.c:1251:70: warning: argument 1 range [18446744071562067968, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
 1251 |                                         ret->RP->Ystates[i] = (int *)calloc(ret->RP->NY,sizeof(int));
      |                                                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/pbt_C_fb.c:4:
/usr/include/stdlib.h:541:14: note: in a call to allocation function ‘calloc’ declared here
  541 | extern void *calloc (size_t __nmemb, size_t __size)
      |              ^~~~~~
src/pfr_read_genos.c: In function ‘ComputeAlleleFreqsFromCounts’:
src/pfr_read_genos.c:1433:46: warning: argument 1 range [18446744071562067968, 18446744073709551615] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
 1433 |                 P->AlleFreqs[i] = (double **)calloc(P->NumLoci,sizeof(double *));
      |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from src/pfr_read_genos.c:3:
/usr/include/stdlib.h:541:14: note: in a call to allocation function ‘calloc’ declared here
  541 | extern void *calloc (size_t __nmemb, size_t __size)
      |              ^~~~~~

All tests were consistent.

GCC 13.2.0

Same warning, but just one of them.

All tests consistent.

eriqande commented 7 months ago

@ckastall and @hzz0024, I am so sorry I didn't get those multiple definition of gSnpSumPedPath errors straightened out earlier.

When I made it back to the snppit source code this morning I had a handful of unstaged changes that fixed it, but I guess I didn't manage to push it up right after I had made those changes months ago.

I'm not sure what is going on with Ubuntu. You might try the pre-compiled Linux binary that is in the repo now: snppit-Linux. That might work for you. (?)

eriqande commented 7 months ago

@pseudogene, I am curious to learn more about the scale of the data that you are trying to analyze with snppit.

How many individuals and how many markers are you using?

We have had no trouble building a 4- to 5-generation pedigree with >18,000 sea-run steelhead trout, running on a grad student's ancient Mac laptop. (See https://doi.org/10.1111/mec.17182).

Are you trying to use thousands of SNPs? That could make things difficult for snppit, especially if missing data rates and genotyping error rates are not well controlled. snppit was developed to do parentage with the small number of SNPs available in 2009 or so, leveraging the fact that greater accuracy with small numbers of SNPs was available by doing the inference on a trio-wise basis. I don't know if you were able to get hold of the original paper about snppit; it is sometimes difficult to access, since De Gruyter (the publisher) seems to have gone through several cycles in the last decade. I've attached it here in case you want to read it. The second full paragraph on page 22 explains some of the issues with increasing numbers of SNPs.

The good news is that, if you have thousands of SNPs, you can do parentage pretty readily just using pairs, and not necessarily have to resort to trios. An R package I have for doing pairwise relationship inference, CKMRsim, might be helpful for you.